Big data is one of technology’s buzziest trends, with companies and investors jumping on the bandwagon and boldly proclaiming that they’re big data. With tech stack data on programming languages, frameworks, and databases now available on CB Insights, we used it to see who is truly big data by identifying who uses the Hadoop framework. Note: We previously looked at the tech stack of billion-dollar companies, which may also be of interest.
For the uninitiated, Hadoop is an open source framework that allows for distributed processing and analytics of immense data sets across clusters of servers. And if funding dollars are any indication, investors are committed to accelerating the adoption of Hadoop. Last quarter, commercial Hadoop vendor Cloudera raised $900M in a single round of financing from Intel Capital, Google Ventures, T. Rowe Price and Michael Dell’s MSD Capital.
Using this tech stack data, we looked at:
- Which industries are most likely to use Hadoop
- Where the companies are located
- How mature the companies using Hadoop are from a financing perspective
- The actual list of companies
BI, Ad and Marketing Tech Lead
The Hadoop platform is used by companies in a range of markets. As Cloudera Chief Strategy Officer Mike Olson explained in a past interview:
“In finance, if you want to do accurate portfolio evaluation and risk analysis, you can build sophisticated models that are hard to jam into a database engine. But Hadoop can handle it. In online retail, if you want to deliver better search answers to your customers so they’re more likely to buy the thing you show them, that sort of problem is well addressed by the platform Google built. Those are just a few examples.”
Of the 347 companies analyzed, Business Intelligence, Analytics & Performance Mgmt had the highest number of companies using Hadoop. Two ad tech-related areas round out the top 3: Advertising, Sales and Marketing Tech, and Advertising Networks & Exchanges. Below are the top 10 industries where companies are using Hadoop, based on CB Insights company tech stack data.
California has the most VC-backed companies using Hadoop to manage and analyze their data, with 61% of identified companies headquartered there. This is perhaps unsurprising given the number of Hadoop vendors in the state, including Cloudera, Hortonworks and MapR, and the state’s general dominance of VC. In line with venture capital market trends, 12% of the companies are based in New York, while another 6% are based in Massachusetts.
44% of VC-backed companies using Hadoop are at the mid-stage
Having a large and quickly scaling set of unstructured or semi-structured data goes hand in hand with the use of Hadoop. Interestingly, of the venture-backed companies analyzed, 44% last raised funding at the mid-stage (Series B/Series C). Companies in this category include Quixey, Dataminr and Datasift. Another 31% are late-stage (Series D+) companies, e.g., Palantir and Spotify.
List of Companies
Below are 20 of the venture-backed companies using Hadoop. The full list of companies can be found on the ‘Research’ tab after logging in to CB Insights. This research is only available to subscribers with access to CB Insider, which requires a Group-level subscription or higher.
- Sift Science
- Practice Fusion
- Lithium Technologies
We will continue to monitor and analyze our tech stack data. If you’d like to stay abreast of additional tech stack analysis, sign up for our newsletter. Subscribers with access to CB Insights on the Division plan can check out the “Tech Stack” tab on company profiles for information ranging from programming languages to operating systems to software applications used by today’s high-growth private technology companies.