While proprietary software providers like IBM and Oracle (among many others) once dominated the technology scene, open-source has since transformed how software is built and implemented.
Today, the term “open-source” is almost synonymous with software development itself.
And as the most collaborative method of software development, it has grown increasingly popular over the last two decades. In fact, the open-source services industry is set to exceed $17B in 2019, and expected to reach nearly $33B by 2022, according to CB Insights’ Market Sizing tool.
Among its many characteristics, open-source software is publicly accessible, which allows developers to exchange code & ideas in a transparent and collaborative fashion. It also enables flexibility for many businesses looking to solve a wide range of problems.
Today, over 30M developers contribute to community-based platforms like GitHub. And the broader market is estimated to be worth hundreds of billions based on recent big-ticket acquisitions like Red Hat (acquired by IBM for $34B) and GitHub (acquired by Microsoft for $7.5B) — as well as large public market valuations like those of MongoDB ($7.9B) and Elastic ($7.3B).
In this analysis, we discuss the various elements of open-source, including what the software is, who the major players are (as well as who they’re benefiting — hint: cloud providers), and what the future has in store.
TABLE OF CONTENTS
- What is open-source software?
- How does open-source software make money?
- Who are the developers of open-source software?
- What are the rising challenges in open-source?
- Where is the open-source software landscape headed?
What is open-source software?
A brief history
Collaborative software development dates back to the 1940s, when binary, programmable computers were available only to a privileged few. And it wasn’t until the 1960s that modifying and redistributing the source code, or the underlying human-readable code base, became an established practice.
With the rise of bundled software (proprietary software packaged with proprietary hardware) came the ability for users to troubleshoot and modify source code themselves — this practice was even encouraged by many manufacturers to limit the need for frequent, onsite visits.
As early adopters of computer software and hardware, universities began sharing bug fixes, and even software enhancements, with other universities. Soon thereafter, universities began sharing their own, free-to-use software with the public (referred to as public-domain software). This trend was bolstered by the expansion of ARPANET and CSNET throughout the 1970s and 80s, which allowed for electronic communication between academic institutions.
But with the introduction of the consumer internet in the mid-1990s, everyone — from academic institutions to teenage consumers — could now communicate and access information electronically. This brought about new opportunities for developers to collaborate and share their work with others.
While organizations like Microsoft, IBM, Oracle (then Software Development Laboratories), SAP, SAS (Statistical Analysis System), and Xerox continued to dominate the software scene, open-source began to offer alternatives to these proprietary solutions.
How it works
Open-source software (OSS) is any type of program or application that developers can inspect, copy, modify, and redistribute. This type of software is also referred to as free and open-source software (FOSS).
To be considered open-source, the source code of the program — the underlying code that makes up the design, functionality, and defining attributes of an application — must be publicly accessible. Those with access to the source code can then study it, “fork” (copy) it, change it, and share the modified version with others.
For example, any business might use the collaborative software to improve its own.
- First, developers will look into a project that promises some type of application improvement or technical solution.
- Then, they will further study the project’s source code to ensure compatibility, security, and compliance.
- After that, they will copy, or fork, the source code before modifying it to meet their specific needs.
- Finally, developers will implement the modified source code into the business’s proprietary software.
They may choose to share the modified source code with the public or keep the modifications for themselves. It depends on how the source code is being used by the business and the licensing terms of the original project.
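The study, fork, and modify steps above can be sketched in a few lines of Python. This is a toy model only: the `Project` class, the project names, and the file contents are hypothetical and stand in for a real repository and hosting platform.

```python
from dataclasses import dataclass, field, replace
from typing import Optional


@dataclass(frozen=True)
class Project:
    """A toy stand-in for an open-source project (hypothetical model)."""
    name: str
    license: str
    files: dict = field(default_factory=dict)  # path -> source text
    upstream: Optional[str] = None  # name of the project this was forked from


def fork(project: Project, new_name: str) -> Project:
    """Copy the project; the fork carries the original license terms with it."""
    return replace(project, name=new_name, files=dict(project.files),
                   upstream=project.name)


def modify(project: Project, path: str, new_source: str) -> Project:
    """Return a new version of the project with one file changed."""
    files = dict(project.files)
    files[path] = new_source
    return replace(project, files=files)


# Hypothetical example: a business forks a small project and adapts one file.
original = Project("geo-lookup", "MIT", {"lookup.py": "def find(): ..."})
internal = modify(fork(original, "acme-geo-lookup"), "lookup.py",
                  "def find(region): ...")
```

The original project is left untouched, while the fork records where it came from and which license it inherited. Whether the business then shares `internal` publicly depends on that license.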
These software projects vary in size and scope, depending on the purpose of the initiative and the number of contributing developers. Thousands of developers may contribute to a large-scale database project designed for the global enterprise, while just a handful of developers may contribute to a smaller initiative, like DIY smart home automations designed for fellow smart home enthusiasts.
Some of the most notable projects since the dot-com bubble include Mozilla’s Firefox (an open-source internet browser that grew out of Netscape’s browser code) in 2002 as well as Git (a source code version-control system created by Linus Torvalds) in 2005.
Open-source licenses (and why they matter)
As open-source software scales, licenses are used to dictate how the software can be used.
Each license outlines different policies for modifying and redistributing source code. Some licenses are more restrictive, in that developers are required to share their changes, while others are more liberal, in that developers do not have to redistribute their modifications.
Choosing a license is an important part of establishing a project. While the differences may seem trivial, the choice of license has long-term implications for the project and its community.
Software projects have the freedom to publish under their own bespoke license. However, it is far more common to publish using a license that is already approved by the Open Source Initiative (OSI) and popular within the community.
There are two primary OSS license types: permissive and copyleft. Permissive licenses are more liberal, in that developers are not required to share their modified code, while copyleft is more restrictive, in that developers must open-source all future iterations of the original project.
The most popular permissive licenses are the MIT license and the Apache 2.0 license.
The MIT license is simple to understand, and many developers will typically choose a project with this license (all else equal). But the terms in the Apache 2.0 license are more detailed than those outlined in the MIT license, and for this reason, the Apache 2.0 license is popular with large open-source projects — those with hundreds (or thousands) of contributors and typically designed for enterprise-scale deployment.
Popular Apache 2.0 projects include Docker, Kubernetes, Swift, and TensorFlow.
On the opposite end of the spectrum is the GNU General Public License (GNU GPL), which is one of the most popular copyleft licenses. Though certain organizations avoid using GNU GPL software due to the copyleft requirement, it is considered one of the most equitable licenses, in that users are required to open-source all of their modifications.
However, the use of GNU GPL has declined in recent years as permissive licenses have become the norm. One of the most popular databases today, MongoDB, switched from the GNU AGPL to its own Server Side Public License (SSPL), which is not OSI-approved, in October 2018.
While there are dozens of other open-source software licenses, developers gravitate towards a select few. This helps with consistency, understanding, and community building. If an experienced developer comes across a project with an Apache 2.0 license, for example, they have a general understanding of how the software can and cannot be used. As a result, they may be more inclined to use it.
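As a rough illustration of the permissive versus copyleft split, the sketch below groups a few licenses by type and answers one question: does this license type require modified versions to be open-sourced? The identifiers follow SPDX-style naming, and the single yes/no rule is a deliberate simplification of the distinction drawn above; real obligations depend on the full license text and on how the software is distributed.

```python
from enum import Enum


class LicenseType(Enum):
    PERMISSIVE = "permissive"
    COPYLEFT = "copyleft"


# A few well-known licenses, grouped by the two types described above.
LICENSE_TYPES = {
    "MIT": LicenseType.PERMISSIVE,
    "Apache-2.0": LicenseType.PERMISSIVE,
    "GPL-3.0-only": LicenseType.COPYLEFT,
}


def must_share_changes(license_id: str) -> bool:
    """True if the license type requires modified versions to be open-sourced.

    Simplified: this ignores when the obligation is actually triggered.
    """
    return LICENSE_TYPES[license_id] is LicenseType.COPYLEFT
```

A developer evaluating a dependency could call `must_share_changes("MIT")` or `must_share_changes("GPL-3.0-only")` as a first-pass filter before reading the license itself.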
However, it’s important to also consider how open-source software licenses facilitate or inhibit future monetization opportunities. While adoption growth is a key metric early on, monetization becomes necessary to support projects and user needs.
How does open-source software make money?
Most open-source software projects don’t start with the goal of monetization.
Instead, they look to provide solutions to problems. Sometimes these problems are small and experienced by few developers. Other times these problems are much larger.
When a project provides a solution to a big problem, demand for that project grows. And as that project scales, a revenue-generating business is established to support its growth.
Typically, these revenue-generating businesses are started by the founders of the original project to provide enterprise support to large organizations adopting the software.
While it may seem counterintuitive that a business would be willing to pay for software that is otherwise free, it’s typically enterprises that request this type of service.
Enterprises want assurances. They want security flaws fixed, dedicated assistance, and software longevity. They are not willing to implement open-source software that has persistent vulnerabilities, that complicates development, or that may become obsolete.
And they understand that a regular stream of revenue to these projects can help provide these assurances. This model of “commercial support” is one of the most established methods of monetization, though there are many others.
Open-source monetization strategies vary from company to company and change as an organization matures.
An increasingly popular model is referred to as “open-core,” which offers a blend of open-source and proprietary software. The core platform remains free and open-source, though feature-limited. Users can then choose to pay for add-on services or to unlock a proprietary, feature-rich platform.
Software that is traditionally proprietary may adopt an open-core model to build open-source community awareness, while software that is traditionally open-source may adopt open-core to capitalize on emerging monetization opportunities. Examples of open-core companies include Docker, Elastic, GitLab, MongoDB, and Redis.
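In code, the open-core split often comes down to a feature gate: core features are always available, while proprietary ones are unlocked by a paid license. The sketch below is a minimal illustration under that assumption; the feature names and the `has_paid_license` flag are hypothetical and not taken from any of the companies named above.

```python
CORE_FEATURES = {"query", "index"}    # free and open-source
PAID_FEATURES = {"sso", "audit_log"}  # proprietary add-ons (hypothetical)


def available_features(has_paid_license: bool) -> set:
    """Return the feature set a user can access under the open-core model."""
    if has_paid_license:
        return CORE_FEATURES | PAID_FEATURES
    return set(CORE_FEATURES)


def use_feature(name: str, has_paid_license: bool = False) -> str:
    """Run a feature, or refuse if it sits behind the paid tier."""
    if name not in available_features(has_paid_license):
        raise PermissionError(f"'{name}' requires a paid license")
    return f"{name}: ok"
```

Where the line between `CORE_FEATURES` and `PAID_FEATURES` is drawn is the business decision: moving a feature from one set to the other makes the product more or less open.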
It is important, however, to note that there are different variations of “openness” within the open-core model. Open-core organizations can choose to become more or less open, depending on their strategic business interests. Striking the right balance between proprietary and open-source is the greatest challenge of the open-core model.
Another type of monetization strategy that has emerged in recent years is that of the corporate-sponsored project. These corporate-owned open-source software projects are typically used as a business development resource rather than a means of direct monetization.
In these instances, the open-source software isn’t the primary offering of the company. Rather, like the open-core model, these corporate-sponsored software projects bring awareness to proprietary products and services offered by the organization.
For example, Google is the primary developer of Kubernetes. Kubernetes is an open-source software platform that automates deployment, scaling, and management of containerized applications.
Containerized applications are made up of numerous containers which store individual microservices. They provide everything required to run the microservice independently (libraries, configuration files, etc.), yet they are lightweight, which makes them less resource-intensive and more cost-efficient.
Microservices operate as smaller, individual services (snippets of code) that connect together to form a comprehensive application. For example, a retailer’s e-commerce app may feature a variety of microservices — one for the login authentication, another for the store locator service, etc.
As a container orchestration tool, Kubernetes is able to manage the deployment and operations of hundreds (even thousands) of containers. Amazon engineers deploy code every 11.7 seconds whereas Netflix engineers deploy code thousands of times per day, according to New Relic — all possible thanks to microservices deployed via containers (and continuous deployment practices).
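Container orchestrators generally work by comparing the desired state of a system with what is actually running and then closing the gap. The sketch below is a toy version of that idea, not the Kubernetes API: the `reconcile` function and the service names (borrowed from the retailer example above) are assumptions for illustration.

```python
from collections import Counter


def reconcile(desired: dict, running: list) -> list:
    """Return the actions needed to make running containers match desired counts.

    desired: service name -> number of container replicas wanted
    running: one entry per container currently running, by service name
    """
    actual = Counter(running)
    actions = []
    for service in sorted(set(desired) | set(actual)):
        diff = desired.get(service, 0) - actual.get(service, 0)
        if diff > 0:
            actions += [("start", service)] * diff
        elif diff < 0:
            actions += [("stop", service)] * -diff
    return actions


# Hypothetical state: too few login containers, one store-locator too many.
desired = {"login": 3, "store-locator": 2}
running = ["login", "store-locator", "store-locator", "store-locator"]
actions = reconcile(desired, running)
# actions == [("start", "login"), ("start", "login"), ("stop", "store-locator")]
```

A real orchestrator runs this kind of comparison continuously and across many machines, which is what makes managing thousands of containers practical.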
While Google doesn’t monetize Kubernetes directly, the wide adoption of the service has brought awareness to the company’s cloud service, Google Cloud Platform (GCP).
Within the increasingly competitive cloud services market, building developer trust with open-source initiatives provides valuable differentiation.
Who are the developers of open-source software?
Though many of the largest open-source software projects (and, eventually, multibillion-dollar businesses) were built by individual developers from the ground up, many of the newest, most successful projects are products of the world’s largest tech companies.
And while millions of independent developers contribute to GitHub-hosted projects each year, contributions from corporate employees are also on the rise.
Corporate employees are contributing both to independent projects and to projects incubated internally at their respective organizations.
Microsoft, Google, Intel, and Facebook — none of which are open-source companies — are actively contributing to various projects on GitHub. Microsoft employees were the most prolific GitHub contributors in 2018, with approximately 7,700 unique contributions (and 4,550 total contributors in 2017).
GitHub was acquired by Microsoft back in June 2018 for $7.5B. At the time, it was the largest enterprise software acquisition in history.
Number of unique contributions made by employees. Source: GitHub
Google employees were also active on GitHub, making 5,500 collective contributions in 2018. Many of these contributions helped to improve smaller, independent projects, though a majority supported Google’s own open-source software projects like Kubernetes, Istio, and Knative.
Another one of Google’s most successful open-source projects in recent years is machine learning (ML) library TensorFlow, which is the most popular ML library available today. Its widespread use has created a large, engaged community, resulting in contributions from many independent developers.
And as corporate-sponsored projects become more popular, independent developers will continue to contribute.
For example, Microsoft’s Visual Studio Code project has over 19,000 contributors in total. It is the most popular GitHub project by a significant margin. With 7,700 total contributions in 2018, Microsoft’s employee contributions to the project are in the minority.
But this is the case with many of the most popular open-source projects on GitHub: 8 of the 10 most popular GitHub projects are products of big tech companies like Microsoft, Facebook, Google, and IBM (Ansible). However, only a fraction of project contributions come from their respective employees.
And while these projects are free to the public and not monetized directly, they are incredibly valuable assets to the sponsoring company.
With thousands of developers contributing, these tech giants benefit from the free developer input and direct user feedback. This allows organizations to build better software, faster.
As mentioned earlier, these projects also act as ongoing lead generation for the sponsoring organization.
In April 2017, GitHub counted nearly 20M users and 57M repositories — which store source code, changes to the source code, and a history of those changes. At the time of Microsoft’s acquisition, GitHub counted 28M users and 85M repositories. And at the end of 2018, GitHub had 31M+ users and 96M+ repositories.
As the number of contributors and total repositories grows, these tech giants are better able to identify future enterprise customers.
For example, Amazon, Microsoft, or Google might look at the contributors of their respective open-source projects for future sources of cloud revenue — today’s engaged open-source contributors may become tomorrow’s cloud customers.
What are the rising challenges in open-source?
Cloud giants benefit immensely from the other popular projects hosted on GitHub — and as a result, independent, open-source software providers have become increasingly guarded as cloud providers reap the benefits of their contributions.
In recent years, cloud providers have copied the source code of popular projects, made minimal changes (if any), rebranded the software, and offered it to customers as a proprietary service.
Open-source software services using permissive licenses, such as the MIT and Apache 2.0 licenses, have been particularly vulnerable.
To combat these threats, many of these popular services are adopting new licenses that limit abuse from commercial service providers but allow for continued virality.
In August 2018, Redis Labs, a popular database management systems company, added the Commons Clause to its permissive license, Apache 2.0. Redis itself remained open-source, but certain add-on modules built by the company would be off-limits to those who monetized them without contributing.
The Commons Clause is a 130-word rider for popular licenses to prevent other commercial service providers from selling the software itself. However, it does allow organizations to use and build with the licensed software.
Software using the Commons Clause is referred to as “source-available”. While less liberal than permissive licenses, source-available software can provide more commercial freedom than copyleft open-source licenses, such as GPL and AGPL.
However, in February 2019, Redis Labs dropped the Commons Clause and replaced it with a new source-available license creatively named Redis Source Available License (RSAL). The company determined that a bespoke license was most suitable for the company and its future well-being.
Open-source companies MongoDB, a popular NoSQL database platform, and Confluent, a service designed for processing streams of data, also moved to bespoke licenses of their own.
However, these changes remain controversial to both the community and commercial service providers — especially as software projects adopt different licenses for different parts of their software.
In certain cases, some aspects of the software may remain fully open-source, while some may become proprietary. In addition, these open-source and proprietary aspects can be mixed together in the same code base.
Independent and corporate developers become confused as to which services can and cannot be modified. Developers don’t want to breach the terms of a license because it lacks clarity.
For example, Elastic, the company behind a popular open-source search engine, has 3 tiers of licenses: free open-source, free proprietary, and paid proprietary. However, the source code of the free open-source and free proprietary tiers is often intertwined.
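To see why mixed licensing in one code base causes confusion, consider a repository where the license depends on the directory a file lives in. The sketch below uses a made-up directory layout (it does not reflect Elastic’s actual repository) and reports which license tiers a given change touches.

```python
# Hypothetical directory-to-license mapping for a mixed-license repository.
LICENSE_BY_PREFIX = {
    "core/": "free open-source",
    "extras/basic/": "free proprietary",
    "extras/premium/": "paid proprietary",
}


def license_for(path: str) -> str:
    """Return the license tier of the longest matching directory prefix."""
    matches = [prefix for prefix in LICENSE_BY_PREFIX if path.startswith(prefix)]
    if not matches:
        return "unknown"
    return LICENSE_BY_PREFIX[max(matches, key=len)]


def tiers_touched(paths: list) -> set:
    """Return every license tier affected by a change to the given files."""
    return {license_for(path) for path in paths}
```

A change that edits both `core/search.py` and `extras/basic/monitor.py` touches two tiers at once, so a contributor has to check two sets of terms before modifying or redistributing anything.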
As a result, AWS announced a new distribution of Elasticsearch (Elastic’s main product) after AWS customers continued to experience frustrations. The CEO of Elastic quickly fired back, explaining that the company was unfazed by the move and that the original Elasticsearch would prevail over this new AWS distribution.
Even so, estimates suggest that AWS generated over $100M in 2018 revenue from just the top 100 customers of its Amazon Elasticsearch Service.
It’s hard to say what Elastic will do next, but a license change may not be out of the question. Despite the company’s commitment to open-source, there is only so much an organization can tolerate before it needs to protect its bottom line.
Where is the open-source software landscape headed?
Whatever the outcome, the open-source landscape is changing.
Cloud providers, especially, are having to differentiate themselves in the face of increasing competition — and open-source software offers low-hanging fruit.
If a third-party open-source software service is popular among a cloud provider’s customer base, it is wise for the cloud provider to replicate that service. Better yet, replicate every line of source code from that third-party service, give it a creative name, and make it proprietary.
That’s exactly what’s happening today. And at-scale software services will continue to face a difficult decision: remain loyal to the open-source community or limit abuse from cloud providers?
There’s no clear answer at this time, but recent actions point to the latter.
Open-source software services will continue to experiment with new license types to strike the right balance between open and closed in an attempt to satisfy users without limiting future monetization opportunities.
This report was created with data from CB Insights’ emerging technology insights platform.