This evening, I will be moderating a panel on data mining and predictive analytics. In preparation, I have been doing a bit of research and wanted to share a few observations that will anchor the agenda for the talk.
We are inundated with data. This is driven by the rapid changes in how we interact, communicate and function in a world that is increasingly connected (both people and things) to the Internet. The amount of data being collected is growing at an explosive rate. In 2010, the data collected was equivalent to the cumulative data generated in the previous 5000 years. And the size of this data store is doubling every 2 years.
Consider the following:
- Facebook has at least 30 petabytes of data, with 30B pieces of content shared per month
- YouTube users upload 35 hours of video every minute
- Twitter delivered 25B tweets in 2010
- Zynga generates 15 terabytes of data every day
None of the companies mentioned above existed 7 years ago.
So we have companies that have far more data than ever before. How does this create a new paradigm? In what ways will commerce be transformed? Companies have been exploiting large data sets for decades with well-established tools and techniques, so what’s different now? Are we just dealing with more noise?
These are the topics that I hope we will gain some insight into in tonight’s discussion.
To get the discussion started, I am highlighting three changes that I see.
#1) Data Availability: Exponentially growing volumes of data are available to everyone, companies and individuals alike. Whereas massive data sets were once the domain of well-capitalized companies, now even startups and individuals can use them to find value.
#2) Tools: Data storage and processing infrastructure and tools are becoming increasingly powerful at a fraction of the cost. Open source software, collaboration platforms, and cloud infrastructure provide mechanisms for not only collecting and storing the data, but also processing it at scale.
#3) Investment: Markets are rewarding companies on the promise that they will find value in the massive data sets that they create. Have you checked out the valuation of Facebook, Twitter, or Zynga lately? To be sure, these companies are achieving astonishing growth rates, but investors are placing their bets on new ways of monetizing the data streams that these and other companies collect and control.
I would be very interested in getting my readers’ ideas on what to ask our panelists. I will try to incorporate suggestions into the flow of the discussion. For those of you who will not be able to make the session, I will be blogging on the key takeaways and insights in the days ahead.
Looking forward to seeing many of you this evening.