This evening, I will be moderating a panel on data mining and predictive analytics. In preparation, I have been doing a bit of research and wanted to share a few observations that will anchor the agenda for the talk.
We are inundated with data, driven by rapid changes in how we interact, communicate, and function in a world in which both people and things are increasingly connected to the Internet. The amount of data being collected is growing at an explosive rate: the data collected in 2010 alone was equivalent to the cumulative total generated over the previous 5,000 years, and the size of this data store is doubling every two years.
Consider the following:
- Facebook holds at least 30 petabytes of data, with 30 billion pieces of content shared per month
- YouTube users upload 35 hours of video every minute
- Twitter delivered 25 billion tweets in 2010
- Zynga generates 15 terabytes of data every day
None of the companies mentioned above existed seven years ago.
So companies now have far more data than ever before. How does this create a new paradigm? In what ways will commerce be transformed? Companies have been exploiting large data sets with tools and techniques that have been around for decades, so what's different now? Are we just dealing with more noise?
These are the topics that I hope we will gain some insight on in tonight’s discussion.
To get the discussion started, I am highlighting three changes that I see.
#1) Data availability: Exponentially growing data sets are available to everyone, companies and individuals alike. Whereas massive data sets were once the domain of well-capitalized companies, now even startups and individuals can use them to find value.
#2) Tools: Data storage and processing infrastructure is becoming increasingly powerful at a fraction of the cost. Open source software, collaboration platforms, and cloud infrastructure provide mechanisms not only for collecting and storing data, but also for processing it at scale.
#3) Investment: Markets are rewarding companies on the promise that they will find value in the massive data sets they create. Have you checked the valuations of Facebook, Twitter, or Zynga lately? To be sure, these companies are achieving astonishing growth rates, but investors are placing their bets on new ways of monetizing the data streams that these and other companies collect and control.
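To make the tools point concrete: the open-source frameworks that have made large-scale processing cheap (Hadoop being the canonical example) are built on a simple map/reduce pattern. The toy word count below is only an illustrative sketch, with a made-up mini-corpus standing in for a real data set; in a real framework the map step would be distributed across many commodity machines.

```python
from collections import Counter

# Hypothetical mini-corpus standing in for a massive data set.
DOCS = [
    "data is the new oil",
    "open source data tools scale",
    "cheap tools make big data accessible",
]

def map_count(doc):
    """Map step: count words within one document (runs independently per doc)."""
    return Counter(doc.split())

def reduce_counts(partials):
    """Reduce step: merge the per-document counts into one tally."""
    total = Counter()
    for c in partials:
        total.update(c)
    return total

# In a distributed framework the map step would fan out across a cluster;
# here Python's built-in map() stands in for that.
word_totals = reduce_counts(map(map_count, DOCS))
print(word_totals["data"])  # 3
```

Because the map step has no shared state, adding machines adds throughput, which is exactly why commodity hardware plus open-source software has collapsed the cost of analytics.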
I would be very interested in getting my readers' ideas on what to ask our panelists, and I will try to incorporate suggestions into the flow of the discussion. For those of you who cannot make the session, I will be blogging on the key takeaways and insights in the days ahead.
Looking forward to seeing many of you this evening.
Last week, Marc Andreessen published an essay in the Wall Street Journal entitled “Why Software is Eating the World”. A particular passage resonated with me:
“Companies in every industry need to assume that a software revolution is coming. …new software ideas will result in the rise of new Silicon Valley-style start-ups that invade existing industries with impunity. Over the next 10 years, the battles between incumbents and software-powered insurgents will be epic.”
Since we are all going to be software companies (eat or be eaten), I think it is important to understand how and where value creation is shifting within the software industry itself. In short, it is a tale of migration from features and functions to data, a trend which many of the enterprise software companies I work with see and understand, but are slow to adapt to, primarily because it is highly disruptive to their existing business models.
To examine this shift, let’s look at how software has evolved over time.
Software as Tools
The first business applications helped individuals do tasks more effectively. Remember Lotus 1-2-3? WordStar? For roughly the first decade of the PC era, software was primarily a tool you used to complete some individual function or task more efficiently. Software was an interface that let us leverage rapid advancements in computing power and get the work we were all doing anyway done faster.
Software as a Repository of Best Practice
From point solutions that streamlined highly generic tasks, vendors began to create value (and charge for it via licensing and services) by embedding specific business process knowledge into applications. People could move down the learning curve more quickly and be managed more effectively by operating in a well-defined and highly customized software environment. Software became a way of capturing and propagating best practice within an enterprise, and something critical in very specific functional areas. CRM was an early example; integrated ERP systems (HR, financial accounting, inventory management, payroll) followed quickly thereafter. The economics of SaaS/PaaS are generating a proliferation of these models across every industry vertical and functional process imaginable. All of these applications create standard work processes and control mechanisms that drive productivity and consistency.
Software as a Communication Medium
As the various parts of a business process became connected, and common standards for connectivity (e.g., XML) evolved, communication and collaboration became central functions integrated into applications. Indeed, communication has given rise to a new value creation mechanism for software: transaction platforms. eBay? PayPal? Skype? They didn't make money by selling software licenses; rather, they built platforms for communication, collaboration, and validation that let them make money on the transactions they brokered.
Software as a Data Collection Mechanism
So now we arrive at data.
Is Zynga a game or a data collection mechanism? Is Google a search engine or a data-based advertising platform? Is Facebook a communication tool or a targeted marketing platform?
We now have access to literally millions of useful applications at little to no cost. To be sure, applications are cheaper to create, but firms are also finding new ways to offer subsidized or free software because of the data they hope to compile through widespread distribution of their products. Software has become a data collection mechanism, and analytic competitors are hoarding data and learning how to make these data streams useful, both to refine their own businesses and to create value for others.
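The collection mechanism itself is often trivially simple: the product instruments every user action as an event and ships it to an analytics pipeline. The sketch below is purely illustrative; the `log_event` helper, event schema, and field names are hypothetical stand-ins for whatever telemetry a real product emits.

```python
import time

def log_event(stream, user_id, action, **props):
    """Append one usage event to a stream (an in-memory list standing in
    for a real analytics pipeline)."""
    stream.append({
        "ts": time.time(),      # when it happened
        "user": user_id,        # who did it
        "action": action,       # what they did
        "props": props,         # arbitrary context about the action
    })

# A free-to-play game might record events like these:
events = []
log_event(events, "u42", "level_complete", level=3, duration_s=87)
log_event(events, "u42", "purchase", item="power_up", price_usd=0.99)

print(len(events))  # 2
```

Each event is cheap to capture, but aggregated across millions of users these streams become the raw material for the targeting, pricing, and product decisions described above.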
One thing is clear: none of the disruptive models that have dominated “bubble 2.0” involve licensing fees. Indeed, the primary source of value that Mr. “no bubble here” Andreessen is so confident in is data. Whether his investments pan out will depend on whether his portfolio companies can meaningfully realize value in the petabytes they control.
What it all means
If you are building a software product, you need to incorporate all of the means of generating value referenced above. Doing so not only maximizes value for your users, but gives you the flexibility to morph your business model in the future.
As you build, expect that at some point soon you will face a highly disruptive competitor seeking to generate profits through business models vastly different from your own.
Recognize that value is shifting towards data and that to win, you had better become great at collecting, managing, analyzing, and MONETIZING all of the data streams that you control.
So if you are in a business that charges fees for software licenses, or runs a platform that earns fees on transactions, and you have no vision or plan for how you will monetize the data you control, welcome to a decade of pain. The “Silicon Valley-style” startups that Mr. Andreessen is funding are coming to eat your world.
Stay tuned for more on data and analytics. If you are in or around San Diego and interested in the topic, be sure to check out an event I am moderating on 9/13.
I am pleased to announce that I will be moderating a panel of academics and executives for a discussion on data mining and predictive analytics in September. Data are such a critical ingredient in the future development and defense of profitable business models (online or off), and my aim for the talk will be to show how leaders across many industries are adapting and innovating in response to this trend.
Thus far we have confirmed the following speakers:
Dr. Stephen Coggeshall, CTO, ID Analytics
Dr. Elea Feit, Research Director for Wharton’s Customer Analytics Initiative (WCAI)
Scott Gnau, President, Teradata Labs
Dr. Christopher Trepel, Senior Vice President Corporate Affairs and Chief Scientific Officer, Encore Capital Group
I will be posting insights and thoughts from the talk on my blog, but if you are in San Diego on 9/13 and interested in how data are transforming the competitive dynamics of multiple industries, you are most welcome to attend. For event details and registration, click here.
Hope to see you there.