How Much Data Collection Is Too Much? Inside the Murky World of Predictive Analytics

Facebook’s 2.1 billion monthly active users likely had some notion that the information they share on the network is mined for ad targeting. However, news that Cambridge Analytica accessed data from the accounts of not just those who downloaded a personality prediction app, but also “friends who had their privacy settings set to allow it,” likely has many of those users pondering data and privacy in the digital era like never before.
This breach is far from new, though. Marketers have long had access to far more digital data about consumers than most realize, which powers sometimes unnervingly accurate predictive analytics. In 2003, a year before Facebook was founded, Target used shopping behavior to determine a Minnesota teenager was pregnant before her father knew. This infamous example of predictive analytics is just one of the unintended consequences of tapping into data about behaviors, locations and transactions to pinpoint patterns around what consumers are likely to do.
In the time it took that Minnesota teenager to raise her own teen, the amount of data brands like Facebook have at their fingertips has grown exponentially, and they’re investing more in predictive analytics and machine-learning software to boot: Forrester Research said spending will increase from $3.5 billion in 2016 to approximately $6.2 billion by 2020.
“I like to say, ‘I once believed in fate and then I understood predictive analytics,’” said tech ethicist David Polgar. “Every ad is tailored to everything you’ve done and trying to predict what you want to purchase.”
Brand use runs the gamut from Netflix recommending content based on viewing behavior to insurance companies offering discounts by tracking driving behavior, although a 2016 Wall Street Journal report found only 25 percent of new Progressive customers were opting in, in part because some drivers found the tracking creepy. Creepy is a word that comes up a lot in predictive analytics.
“I think brands sometimes are two years ahead of the average consumer, who doesn’t necessarily understand all of the predictive analytics and biometrics and how they are gleaned for advertising,” Polgar said. Consumers must decide whether targeted messaging is worth the loss of privacy, or whether they will start to care about what exactly is being tracked and how it’s being used.
Michael Horn, managing director of data science at digital marketing agency Huge, said there’s nothing inherently illegal in the U.S. about using data from quizzes for predictions and targeting, but the Cambridge Analytica example forces us to think about permission and ethical use.
And the addition of voice-enabled devices complicates these issues even further. Terms of service agreements from Amazon and Google are clear that their devices are not always listening and that users can delete recordings at any time. Nevertheless, last October, reports emerged that a hardware flaw caused some Google Home Mini units to record constantly (which Google later fixed). And, more recently, Alexa began laughing of her own volition. (Amazon, too, rolled out a fix.)
As with Target, even if Amazon, Google and others aren’t intentionally spying on consumers, it sometimes feels like an invasion of privacy when shoppers see ads online related to topics they’ve discussed, because those platforms are in all likelihood using data consumers do not realize they’ve provided.
“Even if we’ve opted to have these devices in our home, we may not feel like we’ve opted into it being an always-on device,” said Pete Meyers, marketing scientist at marketing software company Moz. “It’s a different kind of exchange than when we enter data into a form online or give permission for an app to connect with one of our accounts.”
And that’s true even if each individual data point is not objectionable in and of itself. “You start seeing things that are creepy, and the reality is it’s just coming from a massive amount of data,” Meyers said.
Polgar said the pushback against Facebook could result in new conversations about the misalignment between business and human interests and garner more support for an analysis of the ad-based model of social media, as well as greater awareness about how consumer data is being used overall.
Meyers, on the other hand, is skeptical that consumer behavior will drastically change. “We want what Facebook has to offer and there’s no viable competitor,” he said. “We’ve shown our willingness to trade privacy for convenience time and time again.”
Noting that consumer data literacy is a vital issue, Horn said that while one incident will increase awareness, it will not fundamentally shift the landscape. “People still want free content and free social media, which means ad-supported platforms,” he added. “What will shift the balance of power and ethical practices is marketers stepping up and demanding greater transparency and respect for users’ privacy and data security from their vendors and partners.”