A Treatise on Data Businesses
Data businesses are generally misunderstood. (That is an understatement.)
I’ve spent the last 13 years running data companies (previously LiveRamp (NYSE:RAMP) and now SafeGraph), investing in dozens of data companies, meeting with CEOs of hundreds of data companies, and reading histories of data businesses. I’m sharing my knowledge about data businesses here, written primarily for people who either invest in or operate data businesses. Please reach out to me with new information, new ideas, challenges to this piece, corrections, etc. And please let me know if this is helpful to you. (This was written in mid-2019.)
DaaS is not really SaaS … and it is not Compute either
Data businesses have some similarities to SaaS businesses but also some significant differences. While there has been a lot written about SaaS businesses (how they operate, how they get leverage, what metrics to watch, etc.), there has been surprisingly little written about data businesses. This piece serves as an overview of what a 21st-century data business should look like, a guide to what to look for (as an investor or potential employee), and an operational manual for executives.
In the end, great data companies look like the ugly child of a SaaS company (like Salesforce) and a compute service (like AWS). Data companies have their own unique lineage, lingo, operational cadence, and more. They are an odd duck in the tech pond. That makes it harder to evaluate if they are a good business or not.
Everything today is a service — data companies are no exception
Almost all new companies are set up as a service. Software-as-a-Service (like Salesforce, Slack, Google apps, etc.) has been on the rise for the last twenty years. Compute-as-a-service (like AWS, Google Cloud, Microsoft Azure, etc.) has become the dominant means to get access to servers in the last decade. There are now amazing API services (like Twilio, Checkr, Stripe, etc.). And data companies are also becoming services (with the gawky acronym “DaaS” for “Data-as-a-Service”).
Data is ultimately a winner-takes-most market
Long term (with the caveat that the markets work well and the competitors are rational), a niche for data can be dominated by 1 or 2 players. That dominance does not give these players pricing power. In fact, they actually might have negative pricing power (one of the ways a company may continue to dominate a data market is by lowering its price to make it harder for rivals to compete).
As a data company starts to dominate its niche, it can lower its price and gain more market share and use those resources to invest more in the data … thereby gaining more market share (and the cycle continues). Because data companies have no UI and are not predicting the future (see more in the paragraphs below), the data company can dominate by just having the correct facts and having an easy way to deliver those facts (APIs, queryability, self-serve, and integrations become very important).
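To make the cycle concrete, here is a toy simulation of a leader that cuts its price every year and picks up share from rivals as a result. This is only a sketch: the function name, the starting share, the size of the price cut, and the `sensitivity` parameter (how much share a price cut wins) are all hypothetical numbers chosen for illustration, not figures from any real data market.

```python
def simulate_flywheel(share=0.30, price=100.0, price_cut=0.05,
                      sensitivity=2.0, market_units=1000, years=5):
    """Toy model of a data company lowering price to gain share.

    All parameters are illustrative assumptions:
      share        - leader's starting market share (0..1)
      price        - starting price per unit of data
      price_cut    - fraction by which the leader cuts price each year
      sensitivity  - share won per unit of price cut, applied to the
                     part of the market the leader does not yet hold
      market_units - total units bought in the niche each year (fixed)
    """
    rows = []
    for year in range(1, years + 1):
        # The leader lowers its price.
        price *= 1 - price_cut
        # The lower price wins over some of the remaining market.
        share += sensitivity * price_cut * (1 - share)
        # Revenue available to reinvest in the data.
        revenue = share * market_units * price
        rows.append({"year": year, "price": price,
                     "share": share, "revenue": revenue})
    return rows


if __name__ == "__main__":
    for row in simulate_flywheel():
        print(f"year {row['year']}: price={row['price']:.2f} "
              f"share={row['share']:.1%} revenue={row['revenue']:.0f}")
```

In this sketch the price falls every year while share keeps climbing, which is the sense in which a dominant data company can have negative pricing power. Whether revenue rises or falls depends entirely on the assumed sensitivity, so the interesting exercise is changing that number and watching what happens.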
Of course, some data markets have no dominant player and are hyper-competitive. These are generally bad businesses. But even in these businesses with “commodity” data, one can potentially get to 50%+ market share by using price and marketing as a lever. (By contrast, it is very hard to make a competitive SaaS category less competitive … we go into why later in this post).
Data is a growing business
One of the biggest themes in the last ten years has been products that help companies use first-party data better. If you invested in that trend, you had an amazing decade. Those companies include core tools (Databricks, Cloudera), middleware (LiveRamp, Plaid), BI (Tableau, Looker), data processing (Snowflake), log processing (Splunk), and many, many, many more. (Note: as a reminder about the power of these tools … while I was writing this post, both Tableau and Looker were acquired for a combined price of almost $20 billion!)
These products help companies manage their own data better.
The amount of collected first-party data is growing exponentially due to better tools, internet usage, sensors (like wifi routers), etc. Companies are getting better and better at managing this first-party data. At the same time, compute costs continue to fall dramatically every year, so it is cheaper and cheaper to process the data.