Today at GigaOm’s BigData conference in NYC, the experts weighed in on data forecasting, data prediction, and the storage of massive amounts of data. The opening conclusion was that fast data is better than big data, though Jeff Jonas, Distinguished Engineer at IBM, argued that keeping large amounts of raw data and letting technology catch up is best, because with a big data sample, results can be more accurate, with fewer false positives and fewer false negatives.
So why should we care? Because everything we do is tracked online, everything we do is tracked via mobile, and companies have already stored these valuable data points. Do you know what they are tracking and actively pulling? Is there value in all of these clicks, cookies and data points? There sure is, and the market hasn’t even reached its infancy.
Bill McColl, Founder and CEO of Cloudscale, said he sees the shift, and the future, in commercial data and in companies understanding how to leverage the data they have. Jason Hoffman, Founder of Joyent, added that the most generic economic good we have is bits, and Jeff Jonas of IBM said data is like diarrhea. What?
Hilary Mason, Chief Scientist at Bit.ly, gave her new media perspective on all the different data they pull at Bit.ly, the link-shortening site, and boiled it down to two sets of data: 1) structured data: user, date, time, clicks, and 2) unstructured data: metadata, content, changing fields. Her emphasis was on speed: pulling data quickly and reacting quickly. “We don’t call it real-time, we call it relatively recent,” she said, quoting her team and describing their approach to getting results and balancing volume with latency.
Overall, it seems corporations are implementing different strategies for what to do with their data, but nonetheless it’s scary to think that everyone in the industry is still figuring out how to store data and what to do with it. It’s like keeping a loaded gun in the basement: it doesn’t protect you from a break-in, and it can cause serious harm if there is an accident.
The future of data is seemingly focused on predictive technologies, which require results in short amounts of time. That will favor those who store data creatively and figure out how to access it faster than others. It’s pretty scary that corporations are racing to take your data and completely figure you out, no matter how big or small your dataset is.