Companies at the Various Levels of the Big Data Stack

The revenue figures below are from 2014 or 2015, when the Coursera course this note is based on was originally produced.

This post presents a very simplified model of the big data stack, and some examples of the companies that operate on each level of the stack. For purposes of this post, the stack consists of three levels:

  • Desktop and Commercial Software
  • Databases and Enterprise Software
  • Computers and Routers (Hardware)

All of these companies share a common goal: they are trying to reposition themselves as providers of just-in-time and real-time data that enables humans or computers to make data-driven decisions. As a result, most of them are investing heavily in predictive analytics and machine learning projects. Examples include IBM Watson and Microsoft Azure.

The major players for each level are discussed below.

Computers and Routers (Hardware)

This part of the stack consists of hardware, chips, monitors and other physical components assembled to make computers, routers, and servers.

  • Apple ($224B), Intel ($55B), Cisco ($49B), Texas Instruments ($13B), AMD ($4.6B)

Apple is unique among the companies listed because it operates at every level of the data stack, from chips and components to operating systems and applications, all the way up to value-added services like Apple Music and iCloud. This business model has been enormously successful: Apple is the largest IT company in the world.

Databases and Enterprise Software

  • IBM ($86B), Oracle ($38B), SAP ($21B), Symantec ($6.7B), VMware ($6B), Salesforce ($4.1B), Teradata ($2.7B)

This part of the stack focuses on enterprise, relational and distributed databases, large enterprise software applications, and scalable operating systems. Generally, only professionals interact with this software, and it is less user-friendly than software designed for consumers.

IBM is primarily focused on building enterprise systems for companies. Oracle focuses more on large databases that handle high traffic, and on software for customer relationship management, finance, human resources, and supply chain management. Oracle has gradually come to dominate its particular segment. One of its last remaining competitors is Teradata, also listed above. Salesforce is a hybrid: it offers complex enterprise software for managing customer relationships, but delivers it as a hosted service. Symantec makes enterprise security software, and VMware focuses on virtualization software that lets multiple operating systems and users share a single computer.

Desktop and Commercial Software

  • Microsoft ($86B), Adobe ($4.35B), SAS ($4B), MathWorks (Matlab) ($750M), Tableau ($468M)

This software is intended for general consumers, unlike the enterprise software discussed above, so it is more user-friendly. The typical audience is still knowledge workers, however, and the applications enable them to access and work with company data. The best example of this software is the Microsoft Office suite, including Word, Excel, and PowerPoint. SAS and Matlab are more complex statistics and modeling packages. Tableau focuses on user-friendly data manipulation and visualization.

This content is taken from my notes on the Coursera course “Business Metrics for Data-Driven Companies.” The course is sponsored by Duke University, and its content is presented by Professor Daniel Egger.