The velocity of data keeps increasing, and the need for data literacy is increasing with it. Being literate, however, is not the same as being fluent. To create value for your business, you need to think strategically about sources and uses of data and understand how data is turned into action. Let’s review.
You need to be data literate, not data fluent
There is a significant difference between being literate and being fluent. A data-literate executive can recognize a desired outcome and state it clearly. For example, “We need to understand how these sales numbers are related to these categories.” They might even know that a simple linear regression is the proper mathematical technique to accomplish the task. But most importantly, being data literate enables this executive to ask the right questions and seek the right assistance to accomplish the goal.
Someone fluent in mathematics, statistics or data science would look at the problem quite differently. They would also quickly see that a simple linear regression is the appropriate technique, but they would go on to weigh specific methodologies in ways that a literate executive might not. For example, “Is the least squares approach sufficient to achieve the desired outcome for this linear regression, or will a least absolute deviations approach yield a better outcome?”
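To make that fluent-level question concrete, here is a minimal sketch of the two approaches side by side. The data is made up for illustration (five points on a straight line plus one outlier), and the function names are hypothetical. The least absolute deviations fit uses a brute-force search that relies on a known property of the straight-line case: at least one best-fitting line passes through two of the data points. A real project would more likely use a statistics library.

```python
from itertools import combinations

def fit_least_squares(xs, ys):
    """Ordinary least squares: minimizes the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_least_absolute_deviations(xs, ys):
    """Least absolute deviations: minimizes the sum of absolute errors.

    Brute force: try the line through every pair of points and keep
    the one with the smallest total absolute error.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a + b*x
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        cost = sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    return best[1], best[2]

# Illustrative data: y = 2x for the first five points, then one outlier.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 30]

ols_slope, ols_intercept = fit_least_squares(xs, ys)
lad_slope, lad_intercept = fit_least_absolute_deviations(xs, ys)
print(f"least squares:             y = {ols_intercept:.2f} + {ols_slope:.2f}x")
print(f"least absolute deviations: y = {lad_intercept:.2f} + {lad_slope:.2f}x")
```

On this data the single outlier pulls the least squares slope up to roughly 4.57, while the least absolute deviations fit stays on the line through the other five points with a slope of 2. That sensitivity to outliers is the kind of trade-off a fluent practitioner weighs.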
To put it another way, you don’t need to speak French to recognize that the email you just received is written in French. You just need to be literate to the point where you know that Google Translate is not going to get the job done and you need a highly skilled French translator to help you interpret and respond to the communication.
Data literacy basics
Before you can turn data into action, you must collect it. Then, you can do three basic things: transform, learn and predict. In practice, you don’t need to know how to do any of this; you just need to know exactly what each step is and what it is for.
In short, to transform data is to make it ready to learn from. Transforming data includes everything from data munging or data wrangling (the cleaning up of data) to entity extraction (identifying key terms in unstructured data that have value) to true feature extraction (building derived values from existing data).
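As a rough illustration of those three kinds of transformation, the sketch below cleans a few made-up sales records, pulls brand names out of a free-text note, and builds a derived revenue value. The records, the field names and the `KNOWN_BRANDS` vocabulary are all assumptions for the example, and the keyword match is only a toy stand-in for real entity extraction.

```python
# Hypothetical raw records, as they might arrive from a spreadsheet export.
raw_rows = [
    {"category": " Shoes ", "units": "3", "unit_price": "49.90",
     "note": "Customer asked about Nike returns."},
    {"category": "shoes", "units": "", "unit_price": "49.90", "note": ""},
    {"category": "HATS", "units": "2", "unit_price": "15.00",
     "note": "Compared us to Adidas pricing"},
]

KNOWN_BRANDS = {"nike", "adidas"}  # assumed vocabulary for the toy extractor

def clean(row):
    """Data wrangling: tidy text, parse numbers, reject unusable rows."""
    try:
        units = int(row["units"])
        unit_price = float(row["unit_price"])
    except ValueError:
        return None  # a missing or malformed number makes the row unusable
    return {
        "category": row["category"].strip().lower(),
        "units": units,
        "unit_price": unit_price,
        "note": row["note"],
    }

def extract_brands(text):
    """Toy entity extraction: find known brand names in unstructured text."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    return sorted(words & KNOWN_BRANDS)

def add_features(row):
    """Feature extraction: build derived values from existing ones."""
    result = dict(row)
    result["revenue"] = round(row["units"] * row["unit_price"], 2)
    result["brands"] = extract_brands(row["note"])
    return result

cleaned = [row for row in (clean(r) for r in raw_rows) if row is not None]
transformed = [add_features(row) for row in cleaned]
for row in transformed:
    print(row["category"], row["revenue"], row["brands"])
```

The second record is dropped because its unit count is missing. Whether to drop, repair or flag such rows is a judgment call that depends on the business question.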
At this stage you may also enrich your data with additional data (data is more powerful in the presence of other data). And finally, you can summarize your data with aggregation techniques and summary statistics (weighted averages, medians, standard deviations and so on), or by describing its shape, for example with a Gaussian distribution.
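Enrichment and aggregation can be sketched in a few lines. In this example the sales figures and the store-to-region lookup table are invented; the point is only to show one data set being joined to another and then summarized.

```python
from statistics import median, stdev

# Hypothetical sales data and a second data set used to enrich it.
sales = [
    {"store": "A", "units": 10, "unit_price": 5.0},
    {"store": "B", "units": 30, "unit_price": 4.0},
    {"store": "C", "units": 20, "unit_price": 6.0},
]
store_regions = {"A": "north", "B": "south", "C": "north"}

# Enrich: attach the region to every sales record.
enriched = [
    {**sale, "region": store_regions.get(sale["store"], "unknown")}
    for sale in sales
]

prices = [row["unit_price"] for row in enriched]
units = [row["units"] for row in enriched]

# Aggregate: the average price weighted by how many units sold at that price.
weighted_avg_price = sum(p * u for p, u in zip(prices, units)) / sum(units)
median_price = median(prices)
price_stdev = stdev(prices)  # sample standard deviation

print(f"weighted average price: {weighted_avg_price:.2f}")
print(f"median price: {median_price:.2f}")
print(f"standard deviation: {price_stdev:.2f}")
```

The weighted average comes out lower than the median here because the cheapest store sold the most units, which a plain average would hide.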
Once transformed, it’s time to learn. Here you can use techniques such as regression (a common way to predict the future based on the past by modeling the relationships between variables), clustering (grouping similar data points together without being told the groups in advance) and classification (algorithms and other techniques used to identify the category or subpopulation to which a data point belongs).
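The difference between clustering and classification is easier to see in code. The sketch below uses made-up order values: a tiny one-dimensional k-means finds groups with no labels given, while a nearest-centroid classifier learns from labeled examples and then assigns a label to new values. Both are simplified stand-ins chosen for brevity, and the function names are hypothetical.

```python
from statistics import mean

def kmeans_1d(values, k=2, iterations=10):
    """Clustering: group values around k centers, with no labels (k >= 2)."""
    # Deterministic start: spread the centers evenly between min and max.
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centers[i]))
            groups[nearest].append(value)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

def train_nearest_centroid(examples):
    """Classification, step 1: learn the mean value for each known label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in by_label.items()}

def classify(value, centroids):
    """Classification, step 2: assign the label whose mean is closest."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

# Clustering: the algorithm discovers two groups of order values on its own.
order_values = [12, 15, 14, 95, 102, 99]
centers, groups = kmeans_1d(order_values, k=2)
print("clusters:", groups)

# Classification: labeled history is used to label new orders.
labeled_orders = [(12, "small"), (15, "small"), (95, "large"), (102, "large")]
centroids = train_nearest_centroid(labeled_orders)
print("20 is", classify(20, centroids))
print("80 is", classify(80, centroids))
```

Clustering is handed only the numbers and returns groups it found. Classification is handed numbers with labels attached and returns a label for something new.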
After you’ve learned from your data, you can craft predictive models that simulate the future, or use what you’ve learned to make optimal selections from a set of alternatives.
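Here is one way that last step might look, as a sketch under simple assumptions: fit a straight-line trend to each product’s past monthly unit sales, project one month ahead, and pick the product with the highest projected profit. The sales history and per-unit margins are invented, and a straight-line trend is only one of many possible predictive models.

```python
def fit_line(xs, ys):
    """Least squares straight line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical units sold in months 1 to 4, and profit margin per unit.
history = {
    "hats": [10, 12, 14, 16],
    "shoes": [30, 29, 28, 27],
    "scarves": [5, 9, 13, 17],
}
margin_per_unit = {"hats": 4.0, "shoes": 3.0, "scarves": 5.0}

months = [1, 2, 3, 4]
next_month = 5

# Predict: project each product's trend one month into the future.
forecast_units = {}
for product, units in history.items():
    slope, intercept = fit_line(months, units)
    forecast_units[product] = intercept + slope * next_month

# Select: choose the alternative with the highest projected profit.
projected_profit = {
    product: forecast_units[product] * margin_per_unit[product]
    for product in history
}
best_product = max(projected_profit, key=projected_profit.get)

for product in history:
    print(f"{product}: {forecast_units[product]:.0f} units, "
          f"profit {projected_profit[product]:.0f}")
print("feature next month:", best_product)
```

Shoes sell the most units today, but the trend-based projection favors scarves. The prediction feeds the decision; it does not replace the judgment about whether a four-month trend is worth trusting.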
That’s it. That’s what you can do with data to turn it into action.
Get in the game
Injecting yourself into the process is the first step. You should keep up with tweets that are hashtagged #datascience, #AI, and #machinelearning.