7 Questions Marketers Should Consider When Weighing the Quality of Their Data

Sets of numbers can be misunderstood if they lack integrity

The media landscape has realigned itself with lightning speed to the power of data. After a century of being limited by “brute force media,” marketers quickly glommed onto the vast potential of digital and addressable audiences.

Julie Fleischer
The data “exhaust” from the digital experience has been a game changer for marketers, powered first by sites, then by ad networks and finally by programmatic. Money shifted from reach-based to “people-based” planning, augmented by powerful new data companies that monetize the categories and groupings of people brands want to reach in ever finer detail.

But somewhere along the way, the proposition fractured. We discovered that data itself is not the key to addressable marketing and better business outcomes—quality data is. And the difference between data and quality data, or data with integrity, is difficult to see. To use a buzzword of the day, the market for data is not transparent. To use another, it’s a swamp: an opaque, poorly understood mess. If you want to be a data-driven marketer, you need to make your way through the morass to interrogate your data. So here are seven questions to keep in mind in assessing data integrity:

Is your data fresh or stale? The average life of a cookie is 30 days. About 55 million people change their phone carrier every year, 60 million physically move, 43 percent of customer records are out of date or invalid, and 60 percent of data is incorrect within two years.

Data becomes obsolete quickly, yet many providers continue to use stale data because it provides the illusion of scale. The only data that matters is accurate data. Make sure you understand how data is collected, whether it’s corroborated against authoritative identity standards, and how often obsolete data is purged.
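To make the purge question concrete, here is a minimal sketch of a freshness check. The record layout, the `last_verified` field and the 30-day window are assumptions chosen for illustration (the window simply echoes the average cookie life cited above); a real provider's schema and purge policy will differ.

```python
from datetime import date, timedelta

def split_by_freshness(records, as_of, max_age_days=30):
    """Partition records into (fresh, stale) by their last-verified date.

    `last_verified` and the 30-day default are hypothetical; adjust both
    to whatever your provider actually documents.
    """
    cutoff = as_of - timedelta(days=max_age_days)
    fresh = [r for r in records if r["last_verified"] >= cutoff]
    stale = [r for r in records if r["last_verified"] < cutoff]
    return fresh, stale

# Made-up sample records, for illustration only.
records = [
    {"id": "a", "last_verified": date(2017, 4, 20)},
    {"id": "b", "last_verified": date(2017, 1, 5)},
    {"id": "c", "last_verified": date(2015, 3, 1)},
]

fresh, stale = split_by_freshness(records, as_of=date(2017, 4, 24))
stale_share = len(stale) / len(records)
print(f"{stale_share:.0%} of records are older than the purge window")
```

If a provider cannot tell you what share of its file would fail a check like this, the scale it advertises deserves a second look.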

Is your data 3-D or flat? In the world of data, there are six key areas that matter to marketers: demographic (age, gender, income); geographic (where they live/roam); attention (what they concentrate on); consumption (what they buy); behavioral (what matters to them); and intentional (what they’re about to do).

Data providers act as if people exist independently in each of these areas, as if any of the above is sufficient to define a person. I’d ask, are you just a demo? Just a measure of attention? Just a signal of intent? No. Real humans are a combination of all of the above. Collectively, consumers are diverse. Individually, they are multifaceted. Flat data (an individual attribute) is just a signal.

How modeled is your data? Here’s a truth: All data is modeled. Here’s another: At some point, models falter. Do you know at which point?

In order to be useful, data needs to have scale. Marketers seek a balance of specificity and reach. It’s important to understand the size of the initial seed audience versus the size of the total audience to develop a degree of confidence in the data you’re using. If it’s significantly modeled, how certain can you be that you’re still reaching your target?
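One rough way to put a number on "significantly modeled" is to compare the seed with the final audience. The sketch below assumes you can get both figures from your provider; the function name and the sample sizes are invented for illustration, and the ratio is only a prompt for questions, not a quality score.

```python
def modeled_share(seed_size, total_size):
    """Return the fraction of an audience that was modeled rather than observed.

    Assumes the seed is a subset of the total audience (a simplification).
    """
    if total_size <= 0 or not 0 <= seed_size <= total_size:
        raise ValueError("expected 0 <= seed_size <= total_size and total_size > 0")
    return 1 - seed_size / total_size

# Hypothetical figures: 50,000 observed people expanded to 2 million.
seed, total = 50_000, 2_000_000
share = modeled_share(seed, total)
expansion = total / seed

print(f"{share:.1%} of the audience is modeled ({expansion:.0f}x the seed)")
```

In this made-up case, fewer than 3 in 100 people in the segment were ever actually observed doing the thing you are targeting.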

How transparent is the modeling? Do you know your look-alikes? The data market tends to be opaque, and with data, the devil is in the details. If you don’t know how a look-alike audience is formed, you have no idea whether it can be trusted.

For example, most data sets use only digital identifiers and connections. Definitive email-to-cookie linkages generate only a 30 to 50 percent match rate. So the data you’re starting with may be less than half right. Statistical modeling then builds hypothetical look-alikes from that total (which is itself less than half right), compounding the problem. If you don’t know the model, you can’t interrogate the veracity of the data set.

Is your data connected? Most data is digital, but I don’t know of a single person who lives life online only. The world is connected—online and offline. Connected data encompasses both. Most data is based on digital attributes only and is neither linked to offline identity nor normalized versus the population. In other words, it captures a small portion of reality. Data needs to be connected to reflect people’s 3-D lives.

Are you targeting individuals or households? Unless you’re targeting age or gender, you’re better off targeting households than individuals. Here’s an example: It’s amazing how many marketers still target women instead of adults, as if only women are shoppers. Today, 40 percent of primary grocery shoppers are men, and the majority of households share grocery shopping chores.

Targeting only individuals misses a big portion of the grocery-shopping population. Worse yet, most purchase data is generated from shopper card data, which exists at the household level. But generalizing the data from individuals to households requires a connection to the offline world (see prior question). Does your data capture this?

Finally, how many profiles are there? There are 220 million adults in the United States. If your data provider has 3 billion profiles, it isn’t marketing to people; it’s marketing to data points. The data stream that we rely on as marketers grows exponentially each year. Today, there are more IP addresses for devices than people. New ways of parsing, organizing and leveraging data will be invented that will make the media landscape even more addressable and exciting.
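The profile-count question comes down to simple division. The sketch below uses the two figures from this piece (220 million U.S. adults, a provider claiming 3 billion profiles); the function name is just an illustrative choice.

```python
US_ADULTS = 220_000_000  # adult population figure cited in this article

def profiles_per_adult(profile_count, adult_population=US_ADULTS):
    """Return how many profiles a provider holds per real adult."""
    return profile_count / adult_population

ratio = profiles_per_adult(3_000_000_000)
print(f"About {ratio:.1f} profiles for every U.S. adult")
```

Anything far above one profile per person suggests the file is counting cookies, devices and other fragments rather than people.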

But buyer beware: if you don’t kick the data tires and get a more complete understanding of the modern principles of data integrity, you may just be getting crap.

Julie Fleischer (@JFLY) is vp, product marketing at Neustar.

This story first appeared in the April 24, 2017, issue of Adweek magazine.