Why Model?

Why model? Uh, because someone is ridiculously good looking, like Derek Zoolander? No, seriously, why model when we have so much data around?

The short answer is because we will never know the whole truth. That would be the philosophical answer. Physicists construct models to make new quantum field theories more attractive theoretically and more testable physically. If a scientist already knows the secrets of the universe, well, then that person is on a first-name basis with God Almighty, and he or she doesn’t need any models to describe things like particles or strings. And the rest of us should just hope the scientist isn’t one of those evil beings in “Star Trek.”

Another answer to “why model?” is because we don’t really know the future, not even the immediate future. If some object is moving in a certain direction at a certain velocity, we can safely guess where it will end up in one hour. Then again, nothing in this universe is just one-dimensional like that, and there could be a snowstorm brewing in its path, messing up the whole trajectory. And that weather “forecast” that predicted the snowstorm is a result of some serious modeling, isn’t it?

What does all this mean for the marketers who are not necessarily masters of mathematics, statistics or theoretical physics? Plenty, actually. And the use of models in marketing goes way back to the days of punch cards and mainframes. If you are too young to know what those things are, well, congratulations on your youth, and let’s just say that it was around the time when humans first stepped on the moon using a crude rocket ship equipped with less computing power than an inexpensive modern passenger car.

Anyhow, in that ancient time, some smart folks in the publishing industry figured that they would save tons of money if they could correctly “guess” who the potential buyers were “before” they dropped any expensive mail pieces. Even with basic regression models—and they only had one or two chances to get it right with glacially slow tools before the all-too-important Christmas season came around every year—they could safely cut the mail quantity by 80 percent to 90 percent. The savings added up really fast by not talking to everyone.

Fast-forward to the 21st Century. There is still beauty in knowing who the potential buyers are before we start engaging anyone. As I wrote in my previous columns, analytics should answer:

1. To whom you should be talking; and
2. What you should offer once you’ve decided to engage someone.

At least the first part will be taken care of by knowing who is more likely to respond to you.

But in the days when the cost of contacting a person through various channels is dropping rapidly, deciding to whom to talk can’t be the only reason for all this statistical work. Of course not. There are plenty more reasons why being a statistician (or a data scientist, nowadays) is one of the best career choices in this century.

Here is a quick list of benefits of employing statistical models in marketing. Basically, models are constructed to:

  • Reduce cost by contacting prospects more wisely
  • Increase targeting accuracy
  • Maintain consistent results
  • Reveal hidden patterns in data
  • Automate marketing procedures by being more repeatable
  • Expand the prospect universe while minimizing the risk
  • Fill in the gaps and summarize complex data into an easy-to-use format, a must in the age of Big Data
  • Stay relevant to your customers and prospects

We talked enough about the first point, so let’s jump to the second one. It is hard to argue against the “targeting accuracy” part, though there still are plenty of non-believers in this day and age. Why are statistical models more accurate than someone’s gut feeling or sheer guesswork? Let’s just say that in my years of dealing with lots of smart people, I have not met anyone who can think about more than two to three variables at the same time, not to mention potential interactions among them. Maybe some are very experienced in using RFM (recency, frequency, monetary) and demographic data. Maybe they have been reasonably successful with choices of variables handed down to them by their predecessors. But can they really go head-to-head against carefully constructed statistical models?

What is a statistical model, and how is it built? In short, a model is a mathematical expression of “differences” between dichotomous groups. Too much of a mouthful? Just imagine two groups of people who do not overlap. They may be buyers vs. non-buyers; responders vs. non-responders; credit-worthy vs. not-credit-worthy; loyal customers vs. attrition-bound, etc. The first step in modeling is to define the target, and that is the most important step of all. If the target is hanging in the wrong place, you will be shooting at the wrong place, no matter how good your rifle is.

And the target should be expressed in mathematical terms, as computers can’t read our minds, not just yet. Defining the target is a job in itself:

  • If you’re going after frequent flyers, how frequent is frequent enough for you? Five times a year or 10 times a year? Or somewhere in between? Or should it remain continuous?
  • What if the target is too small or too large? What then?
  • If you are looking for more valuable prospects, how would you express that? In terms of average spending, lifetime spending or sheer number of transactions?
  • What if there is an inverse relationship between frequency and dollar spending (i.e., high spenders shopping infrequently)?
  • And what would be the borderline number to be “valuable” in all this?
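To see why the definition matters so much, here is a minimal sketch in Python. The customer records and the two cutoffs are made up purely for illustration; the point is only that moving the line for “frequent” changes who is in the target group and how big that group is.

```python
# Hypothetical records: (customer_id, flights_per_year). Not real data.
customers = [("A", 2), ("B", 5), ("C", 7), ("D", 10), ("E", 14)]


def flag_target(records, min_flights):
    """Return {customer_id: 1 or 0}, where 1 means 'in the target group'."""
    return {cid: int(flights >= min_flights) for cid, flights in records}


def target_rate(flags):
    """Share of all records that fall into the target group."""
    return sum(flags.values()) / len(flags)


# Try the two candidate definitions of "frequent" mentioned above.
for cutoff in (5, 10):
    flags = flag_target(customers, cutoff)
    print(f"cutoff={cutoff}: target share={target_rate(flags):.0%}, flags={flags}")
```

With these toy numbers, a cutoff of five flights puts most of the file in the target, while a cutoff of 10 leaves a much smaller group. Neither is “right” on its own; that is a business decision made before any mathematics kicks in.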

Once the target is set, after much pondering, then the job is to select the variables that describe the “differences” between the two groups. For example, I know how much marketers love to use income variables in various situations. But if that popular variable does not explain the differences between the two groups (target and non-target), the mathematics will mercilessly throw it out. This rigorous exercise of examining hundreds or even thousands of variables is one of the most critical steps, during which many variables go through various types of transformations. Statisticians have different preferences in terms of ideal numbers of variables in a model, while non-statisticians like us don’t need to be too concerned, as long as the resultant model works. Who cares if a cat is white or black, as long as it catches mice?

Not all selected variables are equally important in model algorithms, either. More powerful variables will be assigned higher weights, and the sum of these weighted values is what we call the model score. Now, non-statisticians who have been slightly allergic to math since the third grade only need to know that the higher the score, the more likely the record in question is to be like the target. To make matters even simpler, let’s just say that you want higher scores over lower scores. If you are a salesperson, just call the high-score prospects first. And would you care how many variables are packed into that score, as long as you get the good “Glengarry Glen Ross” leads on top?
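For the curious, the “sum of weighted values” idea can be sketched in a few lines of Python. The variable names, weights and prospects below are all hypothetical, and a real model would estimate the weights from data (and often transform the sum further); this only shows the mechanics of scoring and ranking.

```python
# Hypothetical weights, as if produced by a fitted model. Illustration only.
weights = {
    "recency_months": -0.4,    # longer since last purchase lowers the score
    "orders_last_year": 0.9,   # more orders raises the score
    "pct_septic_tanks": 0.02,  # neighborhood-level variable
}

# Hypothetical prospect records using the same variable names.
prospects = {
    "P1": {"recency_months": 2, "orders_last_year": 4, "pct_septic_tanks": 10},
    "P2": {"recency_months": 12, "orders_last_year": 1, "pct_septic_tanks": 60},
    "P3": {"recency_months": 6, "orders_last_year": 2, "pct_septic_tanks": 30},
}


def model_score(record, weights):
    """Weighted sum of the record's variable values."""
    return sum(weights[name] * record[name] for name in weights)


# Highest score first: these are the leads to call first.
ranked = sorted(
    prospects,
    key=lambda pid: model_score(prospects[pid], weights),
    reverse=True,
)
print(ranked)
```

The salesperson only ever needs the ranked list. How many variables went into each score stays under the hood.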

So, let me ask again. Does this sound like something a rudimentary selection rule with two to three variables can beat when it comes to identifying the right target? Maybe someone can get lucky once or twice, but not consistently.

That leads to the next point, “consistency.” Because models do not rely on a few popular variables, they are far less volatile than simple selection rules or queries. In this age of Big Data, there are more transaction and behavioral data in the mix than ever, and they are far more volatile than demographic and geo-demographic data. Put simply, people’s purchasing behavior and preferences change much faster than family composition or their income, and that volatility factor calls for more statistical work. Plus, all facets of marketing are now more about measurable results (ah, that dreaded ROI, or “Roy,” as I call it), and the businesses call for consistent hitters over one-hit wonders.

“Revealing hidden patterns in data” is my favorite. When marketers are presented with thousands of variables, I see a majority of them just sticking to a few popular ones all the time. Some basic recency and frequency data are there, and among hundreds of demographic variables, the list often stops after income, age, gender, presence of children, and some regional variables. But seriously, do you think that the difference between a luxury car buyer and an SUV buyer is just income and age? You see, these variables are just the ones that human minds are accustomed to. Mathematics does not have such preconceived notions. Sticking to a few popular variables is like children repeatedly using three favorite colors out of a whole box of crayons.

I once saw a neighborhood-level U.S. Census variable called “% Households with Septic Tanks” in a model built for a high-end furniture catalog. Really, the variable was “percentage of houses with septic tanks in the neighborhood.” Then I realized it made a lot of sense. That variable was revealing how far that neighborhood was from populous city centers. The higher the percentage of septic tanks, the farther the residents were from the city center. And maybe those folks who live in sparsely populated areas were more likely to shop for furniture through catalogs than the folks who live closer to commercial areas.

This is where we all have that “aha” moment. But you and I will never pick that variable in anything that we do, not in a million years, no matter how effective it may be in finding the target prospects. The word “septic” may scare some people off at “hello.” In any case, modeling procedures reveal hidden connections like that all of the time, and that is a very important function in data-rich environments. Otherwise, we will not know what to throw out without fear, and the databases will continuously become larger and more unusable.

Moving on to the next points, “Repeatable” and “Expandable” are somewhat related. Let’s say a marketer has been using a very innovative selection logic that she came across almost by accident. In pursuing special types of wealthy people, she stumbled upon a piece of data called “owner of swimming pool.” Now, she may have even had a few good runs with it, too. But eventually, that success will lead to the question of:

1. Having to repeat that success again and again; and
2. Having to expand that universe, when the “known” universe of swimming pool owners becomes depleted or saturated.

Ah, the chagrin of a one-hit-wonder begins.

Use of statistical models, with the help of multiple variables and scalable scoring, would avoid all of those issues. You want to expand the prospect universe? No trouble. Just dial down the scores on the scale a little further. We can even measure the risk of reaching into the lower-scoring groups. And you don’t have to worry about coverage issues related to a few variables, as those won’t be the only ones in the model. Want to automate the selection process? No problem there, as using a score, which is a summary of key predictors, is far simpler than having to carry a long list of data variables into any automated system.
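Here is a rough sketch, in Python, of what “dialing down the scores” and measuring the risk can look like. The scores and response flags are invented for illustration, standing in for a past campaign where we already know who responded.

```python
# Hypothetical (score, responded) pairs from a past campaign. Not real data.
scored = [
    (0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1), (0.5, 0),
    (0.4, 0), (0.3, 1), (0.2, 0), (0.1, 0), (0.05, 0),
]


def summarize(scored, cutoff):
    """Return (universe size, response rate) for records at or above cutoff."""
    chosen = [(score, responded) for score, responded in scored if score >= cutoff]
    size = len(chosen)
    rate = sum(responded for _, responded in chosen) / size if size else 0.0
    return size, rate


# Lowering the cutoff grows the universe; the response rate shows the price.
for cutoff in (0.6, 0.3):
    size, rate = summarize(scored, cutoff)
    print(f"cutoff={cutoff}: {size} prospects, response rate {rate:.0%}")
```

In this toy example, lowering the cutoff brings in more names at a lower response rate. That trade-off is the measurable “risk” of reaching deeper, and the marketer gets to decide how far down is still worth it.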

Now, that leads to the next point, “Filling in the gaps and summarizing the complex data into an easy-to-use format.” In the age of ubiquitous and “Big” data, this is the single most important point, way beyond the previous examples for traditional 1-to-1 marketing applications. We are definitely going through massive data overloads everywhere, and someone had better refine the data and provide some usable answers.

As I mentioned earlier, we build models because we will never know the whole truth. I believe that the Big Data movement should be all about:

1. Filtering the noise from valuable information; and
2. Filling the gaps.

“Gaps,” you say? Believe me, there are plenty of gaps in any dataset, big or small.

When information continues to get piled on, the resultant database may look big. And it is physically large. But in marketing, as I repeatedly emphasized in my previous columns, the data must be realigned to “buyer-centric” formats, with every data point describing each individual, as marketing is all about people.

Sure, you may have tons of mobile phone-related data. In fact, it could be quite huge in size. But let me turn that upside down for you (more like sideways-up, in practice). Now, try to describe everyone in your footprint in terms of certain activities. Say, “every smartphone owner who used more than 80 percent of his or her monthly data allowance on average for the past 12 months, regardless of the carrier.” Hey, don’t blame me for asking these questions just because it’s inconvenient for data handlers to answer them. Some marketers would certainly benefit from information like that, and no one cares about just bits and pieces of data, other than for some interesting tidbits at a party.
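For the data handlers in the audience, here is a minimal Python sketch of that “sideways-up” turn: usage rows stored month by month get rolled up into one answer per person. The row layout and names are assumptions for illustration, and only two months are shown instead of 12 to keep it short.

```python
from collections import defaultdict

# Hypothetical usage rows: (person, month, gigabytes_used, gigabytes_allowed).
usage_rows = [
    ("ann", "2024-01", 8.5, 10.0),
    ("ann", "2024-02", 9.0, 10.0),
    ("bob", "2024-01", 1.0, 5.0),
    ("bob", "2024-02", 2.0, 5.0),
]


def heavy_users(rows, threshold=0.8):
    """Return {person: True/False} for 'average monthly usage above threshold'."""
    ratios = defaultdict(list)
    for person, _month, used, allowed in rows:
        ratios[person].append(used / allowed)  # share of allowance used that month
    return {
        person: sum(monthly) / len(monthly) > threshold
        for person, monthly in ratios.items()
    }


print(heavy_users(usage_rows))
```

The output has one line per person, not per event, which is the format a marketer can actually select from.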

Here’s the main trouble when you start asking buyer-related questions like that. Once we try to look at the world from the “buyer-centric” point of view, we will realize there are tons of missing data (i.e., a whole bunch of people with not much information). It may be that you will never get this kind of data from all carriers. Maybe not everyone is tracked this way. In terms of individuals, you may end up with less than 10 percent in the database with mobile information attached to them. In fact, many interesting variables may have less than 1 percent coverage. Holes are everywhere in so-called Big Data.
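Checking for those holes is not complicated. The sketch below, in Python, counts how many people actually have a value for each variable; the records and field names are made up, with None standing in for “we don’t know.”

```python
# Hypothetical buyer-centric records; None marks a missing value.
people = [
    {"id": 1, "age": 34, "mobile_data_gb": 7.2, "pool_owner": None},
    {"id": 2, "age": 51, "mobile_data_gb": None, "pool_owner": None},
    {"id": 3, "age": None, "mobile_data_gb": None, "pool_owner": None},
    {"id": 4, "age": 28, "mobile_data_gb": None, "pool_owner": None},
]


def coverage(records, field):
    """Share of records with a known (non-missing) value for the field."""
    known = sum(record.get(field) is not None for record in records)
    return known / len(records)


for field in ("age", "mobile_data_gb", "pool_owner"):
    print(f"{field}: {coverage(people, field):.0%} coverage")
```

Run something like this against a real marketing database and the low-coverage variables tend to show up quickly.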

Models can fill in those blanks for you. For all those data compilers who sell age and income data for every household in the country, do you believe that they really “know” everyone’s age and income? A good majority of the information is based on carefully constructed models. And there is nothing wrong with that.

If we don’t get to “know” something, we can still get to a “likelihood” score, a measure of “being like” that something. And in that world, every measurement is on a scale, with no missing values. For example, the higher the score of a model built for a telecommunication company, the more likely the prospect is to use a high-speed data plan or international long-distance services, depending on the purpose of the model. Or the more likely the person is to buy sports packages via cable or satellite, or to subscribe to premium movie channels. Etc., etc. With scores like these, a marketer can initiate a conversation with a particular prospect, rather than just talking to that prospect, with customized product packages in hand.

And that leads us to the final point in all this, “Staying relevant to your customers and prospects.” That is what Big Data should be all about—at least for us marketers. We know plenty about a lot of people. And they are asking us why we are still so random about marketing messages. With all these data that are literally floating around, marketers can do so much better. But not without statistical models that fill in the gaps and turn pieces of data into marketing-ready answers.

So, why model? Because a big pile of information doesn’t provide answers on its own, and that pile has more holes than Swiss cheese if you look closely. That’s my final answer.