When It Comes to Understanding Consumer Behavior, Measuring Time Is of the Essence

An accurate way to predict future action and verify user identity

Data is top-of-mind for every ad buyer and brand marketer. You can certainly understand why, since the successful use of data has become the competitive advantage for savvy digital advertisers. In fact, when you think about it, the common theme across virtually all successful digital marketing and advertising models is their “data driven-ness.”

Let’s dig a bit deeper into which data elements are most valuable for a digital marketer.

Considering data attributes

By data, we’re looking at the information used to match the goals of a campaign to potential recipients of the creative messaging. This information is largely embedded in ad-request signals emanating from media available for placing the ads. For example, your typical programmatic bid request (BRQ) contains 50-plus attributes, with the goal of evaluating context, including both demographic attributes and behavioral properties.

While all of the attributes passed in a BRQ are necessary to evaluate context, two are considered particularly useful for user modeling: the location of the device and the media (e.g., mobile app) from which the ad request originated.

For example, if a user is accessing DIY apps and the device is seen at a Home Depot, it would be logical to target this person with an ad for a home improvement product. Data can also provide contextual information: An ice cream ad is better targeted to a person in Miami (location) on a sunny afternoon (weather data) than a consumer in New York City in snowy January.

The value of time

Now let’s add time data, not commonly used for ad targeting, into this equation.

The power of the time dimension is its ability to identify persistence of behavior. An action a user performs once says little, but an action performed repeatedly over time indicates a habit, and that is very valuable. Habits are both a strong predictor of future action and a more accurate way to verify existing user information.

Age and gender, for example, are notoriously difficult to identify when observing location and app usage in a vacuum. But when you are able to observe both over time, patterns emerge that tell a more concrete story. A user who consumes women’s fashion content one time may or may not be female, but a user who has used a women’s fashion app hundreds of times over six months is almost surely female. In general, a consumer’s life journey is only identifiable when observed persistently. One year a consumer is accessing house-hunting apps, and the next they’re consistently using pregnancy apps. This consumer’s evolving life journey is most actionable when analyzed over time.

Taking time into consideration in data analysis reveals key behavioral traits that would otherwise remain hidden. But such analysis is still relatively uncommon. This is puzzling, especially when you consider the data elements that enable persistent analysis are already embedded in ad requests—timestamps and location.

Making room for time data

The problem is not simply the availability of time data, but the complexity of the underlying processes necessary to compute persistence. The first step is to store ad requests over time. To identify regular churchgoers, for instance, you must figure out which devices have appeared in churches weekly over an extended period of, say, six months. This requires at least six months of stored ad requests.
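
To make the churchgoer example concrete, here is a minimal sketch of one way to compute weekly persistence from stored sightings. It is an illustration under stated assumptions, not a description of any production system: the function name `weekly_regulars`, the `(device_id, date)` input shape, and the 80 percent threshold are all hypothetical, and the sightings are assumed to have already been filtered to the venue type of interest.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_regulars(sightings, start, weeks=26, min_fraction=0.8):
    """Return device IDs seen in at least `min_fraction` of the weeks.

    `sightings` is an iterable of (device_id, date) pairs, assumed to be
    pre-filtered to one venue type (hypothetical input shape).
    """
    weeks_seen = defaultdict(set)
    for device_id, day in sightings:
        offset = (day - start).days
        # Ignore sightings outside the observation window.
        if 0 <= offset < weeks * 7:
            weeks_seen[device_id].add(offset // 7)
    needed = min_fraction * weeks
    return {d for d, seen in weeks_seen.items() if len(seen) >= needed}


start = date(2023, 1, 1)
# device-a appears once a week for 26 weeks; device-b appears twice in week 0.
sightings = [("device-a", start + timedelta(weeks=w)) for w in range(26)]
sightings += [("device-b", start), ("device-b", start + timedelta(days=1))]
regulars = weekly_regulars(sightings, start)
print(regulars)
```

Counting distinct weeks rather than raw sightings is the point of the sketch: two visits in one week count once, so a burst of activity is not mistaken for a habit.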

In our experience, two years of stored history is adequate to perform the vast majority of persistence analysis. Keeping that much history is a gargantuan task, because the volume is too large to store economically and process effectively. For example, the U.S. mobile programmatic ecosystem generates about 100 billion BRQs daily, resulting in about 100TB of raw data. Accumulated over two years at that rate, the raw volume comes to roughly 73 petabytes. The economics of storage make such accumulation infeasible.

So how do you make storing this amount of data—and doing this kind of analysis—viable? We believe compression provides a promising direction. Compression is a well-studied problem and a variety of techniques have been invented for different scenarios, but the reduction factors are small relative to our requirements. A 5:1 reduction is considered a home run in traditional applications (2:1 and 3:1 are more common), whereas our industry needs reductions of 50:1 or higher to be practical for this scale of data storage.
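
A back-of-the-envelope calculation helps put these ratios in perspective. The sketch below uses only the daily figures quoted above (about 100 billion BRQs and about 100TB per day); decimal units (1PB = 1,000TB) and a 730-day window are assumptions made for the illustration.

```python
BRQS_PER_DAY = 100e9      # about 100 billion bid requests per day (from the text)
RAW_TB_PER_DAY = 100      # about 100TB of raw data per day (from the text)
DAYS = 2 * 365            # two years of history (assumed 730 days)

# Implied average size of one bid request, in bytes (decimal units assumed).
bytes_per_brq = RAW_TB_PER_DAY * 1e12 / BRQS_PER_DAY

# Raw two-year volume in petabytes, taking 1PB = 1,000TB.
raw_pb = RAW_TB_PER_DAY * DAYS / 1000


def stored_pb(ratio):
    """Two-year footprint in PB after compressing at `ratio`:1."""
    return raw_pb / ratio


for ratio in (1, 3, 5, 50):
    print(f"{ratio}:1 -> {stored_pb(ratio):.1f} PB")
```

Under these assumptions, a 5:1 reduction still leaves well over 10PB to store, while 50:1 brings two years of history down to roughly one and a half petabytes.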

At Mobilewalla, we are working hard at inventing an entirely new class of “semantic” compression schemes that take advantage of the structural uniqueness of ad supply data and make such reductions feasible. Using these schemes, we are able to observe billions of devices over months and years, resulting in a deeper understanding of evolving consumer behavior and a more nuanced understanding of marketplaces and technologies.

The advertising industry already understands the importance of time when it comes to the “when” of engaging consumers. But to gain a holistic understanding of the consumer life journey, time is truly of the essence when it comes to mobile advertising data.