With over 175 zettabytes of data expected by 2025, data centers will continue to play a vital role in the ingestion, computation, storage, and management of information.
Often hidden in plain sight, data centers are the backbone of our internet. They store, communicate, and transport the information we produce every single day. The more data we create, the more vital our data centers become.
But many of today’s data centers are clunky, inefficient, and outdated. To keep them running, data center operators, from FAMGA to colocation providers, are working on upgrading them to fit our ever-changing world.
In this report, we take a deep dive into the many aspects of data centers and how they’re evolving, from where and how they’re built, to the energy they’re using, to the hardware that operates inside them.
Access to fiber networks, the price of energy sources, and the surrounding environment all play a role in choosing where to build data centers.
By some estimates, the global data center construction market might be worth up to $57B by 2025. It’s a big enough challenge that commercial real estate giant CBRE has launched a whole division specifically focused on data centers.
Source: Data Center Map
Building near cheap power sources
Placing energy-hungry data centers next to cheap power sources can make them more affordable to run. This can also be a cleaner option. As emissions rise and big tech companies increasingly get flak for using dirty energy to power data centers, clean sources are an important consideration.
Both Apple and Facebook have built data centers near hydropower resources. In central Oregon, Apple bought a hydroelectric project to supply its Prineville data center with clean power. Apple has pointed to Oregon’s deregulated electricity markets as a prime reason that it’s created several data centers in the area. The deregulation gives Apple the option to buy power directly from third-party renewable providers instead of being forced to buy from local utilities.
In Luleå, Sweden, Facebook has built a mega-data center next to a hydroelectric plant. Data centers have started cropping up across the Nordic region, Sweden included, where the climate is cool, the geography is stable and not prone to earthquakes, and renewable resources like hydropower and wind are plentiful.
Facebook’s Luleå, Sweden data center under construction in 2012. Source: Facebook
Several big data centers have moved into or are eyeing the Nordic region.
In December 2018, Amazon Web Services announced the opening of its AWS Europe Region data center, about an hour away from Stockholm. In the same month, Microsoft acquired 130 hectares of land in two neighboring parts of Sweden, Gävle and Sandviken, with the intention of developing data centers there.
In October 2017, Google bought 109 hectares of land in rural Sweden, announcing that the land may be used as a location for a data center.
In addition to proximity to cheap, clean energy sources, companies building data centers may also look for cooler climates. Places near the Arctic Circle, like Northern Sweden, can allow data centers to save energy on cooling.
Telecom company Altice Portugal reports that its Covilhã data center uses outside air to cool its servers 99% of the time. And an older Google data center in Hamina, Finland uses seawater from the Bay of Finland to cool the facility and reduce energy use.
Data center company Verne Global has opened a campus in Iceland that taps into local geothermal and hydroelectric sources. It’s on a former NATO base and located midway between Europe and North America — the world’s two largest data markets.
Building in emerging markets
Placing a data center in a hotspot of emerging internet traffic can ease congestion and speed up service to a region.
For example, in 2018, China’s Tencent, which owns WeChat and holds a stake in Fortnite maker Epic Games, placed data centers in Mumbai. The move was a sign that internet usage in the region was booming and that Tencent’s gaming platforms were growing more popular.
Building data centers in areas where internet use is booming can also be a strategic business move. If local businesses are growing, they might consider moving their operations to a nearby data center.
Data centers that rent out their services to other customers are known as colocation facilities. These centers provide a data center building with cooling, power, bandwidth, and physical security, then lease out this space to customers, who add servers and storage. Colocation facilities generally focus on companies with smaller data center needs and help companies save money on infrastructure.
Tax incentives & local laws
Data centers represent a significant new source of revenue for electricity producers — and governments are offering incentives to attract development.
Beginning in January 2017, the Swedish government placed a 97% tax cut on any electricity used by data centers. Sweden’s electricity is relatively expensive, and the tax cut placed Sweden on par with other Nordic countries.
In December 2018, Google negotiated a 100%, 15-year data center sales tax exemption in New Albany, Ohio for a $600M data center. The exemption came with the possibility of renewing through 2058. And in September 2018, France, post-Brexit and hoping to attract economic talent and global capital, announced plans to cut taxes on electricity consumption for data centers.
In some cases, building a data center is a way for companies to continue to operate in countries with strict local laws.
In February 2018, Apple started storing data in a center based in Guizhou, China in order to comply with local laws. While Chinese authorities previously had to go through the US legal system to gain access to Chinese citizens’ information stored in a US-based data center, a local data center grants authorities easier and quicker access to Chinese citizens’ information stored in Apple’s cloud.
Connection to fiber & security
Two more important considerations when it comes to data center locations are connectivity to fiber networks and tight security.
For many years, fiber networks have been a key factor in choosing where to build data centers. While data is often delivered to mobile devices via wireless networks or local Wi-Fi, that’s usually just the last part of the journey.
Most of the time, data travels to and from data storage facilities through fiber-optic cables — making fiber the nervous system of the internet. Fiber connects data centers to cellular antennas, in-home routers, and other data storage facilities.
Source: Submarine Cable Map
Ashburn, Virginia and its neighboring regions have emerged as a large market for data centers, largely due to the area’s large network of fiber infrastructure, which was put in place by internet company AOL when it built its headquarters there. When other companies like Equinix moved into the region and built their own data centers, the area’s fiber networks continued to grow — and attract even more new data centers.
Facebook has invested nearly $2B into a data center in Henrico, Virginia, and in January 2019 Microsoft received approval on a $1.5M grant to expand its Southside Virginia data center for the sixth time.
And as dependence on fiber-optic cable grows, in both developed and emerging markets, capital expenditures on the infrastructure are expected to reach nearly $150B by 2019.
But new alternatives to fiber networks are emerging, as big tech builds out its own connectivity infrastructure.
In May 2016, Facebook and Microsoft announced that they were working together to build an underwater cable between Virginia Beach, Virginia and Bilbao, Spain. In 2017, Facebook announced further plans to build its own 200-mile-long underground fiber network in Los Lunas, New Mexico to connect its planned New Mexico data center to other server farms. The underground fiber system will create three unique network paths for information to travel to Los Lunas.
In addition to connectivity, security is another important consideration. This is especially important for data centers that house sensitive information.
For example, Norwegian financial giant DNB partnered with Green Mountain Data Centre to establish its data center. Green Mountain placed DNB’s data center in its high security facility, a converted bunker inside a mountain. Green Mountain claims that the mountain is entirely impenetrable “against all kinds of dangers, including intruders, terrorist attacks, volcanoes, storms, earthquakes, and crime.”
The “Swiss Fort Knox” data center is located underneath the Swiss Alps, with a door camouflaged to look like a rock. Its complex internal setup includes many tunnels that are only accessible with proper clearance. The data center is protected with emergency diesel engines and an air-pressure system that prevents any poisonous gases from entering the premises.
The “Swiss Fort Knox” is both physically and digitally hyper-secure. Source: Mount10
Due to the high-security nature of the servers at certain data centers, there are plenty of data centers whose locations have never been revealed. For these centers, location information can be wielded as a weapon.
In October 2018, WikiLeaks published an internal list of AWS facilities. Among the information revealed was the fact that Amazon was a contender to build a private cloud worth nearly $10B for the Department of Defense.
WikiLeaks itself has had difficulty finding the right data center to host its information. AWS stopped hosting WikiLeaks because the organization had violated AWS’s terms of service, namely by releasing documents that it did not own the rights to and putting lives at risk.
WikiLeaks has subsequently moved to many different data centers. At one point, it was even rumored that WikiLeaks was considering placing its data center in the ocean, at self-appointed state Sealand in the North Sea.
While location is arguably the best way to mitigate the risk of data center outages, data centers’ structures also play an important role in ensuring reliability and longevity.
The construction of a data center can make it resistant to seismic activity, flooding, and other types of natural disasters. In addition, certain designs can accommodate expansion, reduce energy costs, or allow new types of uses.
Across industries — from healthcare to finance to manufacturing — companies rely on data centers to support the growing creation and consumption of data. In some cases, these data centers may be proprietary and located on-premise, while others may be shared and located remotely.
Either way, data centers are at the heart of the world’s growing technological focus and continue to experience physical transformations of their own. According to the CB Insights Market Sizing tool, the global data center services market is estimated to reach $228B by 2020.
One of the more recent transformations in data center construction is size. Some data centers have become smaller and more distributed (these are referred to as edge data centers). At the same time, other data centers have also become larger and more centralized than ever before (mega data centers).
In this section, we explore both edge and mega data centers in addition to a number of other forward-thinking data center designs.
Edge Data Centers
Small, distributed data centers, called edge data centers, are being deployed to provide hyper-local storage and processing capacity at the edge of the network.
While cloud computing has traditionally served as a reliable and cost-effective means for connecting many devices to the internet, the continuous rise of IoT and mobile computing has put a strain on networking bandwidth.
Edge computing technology is now emerging to offer an alternative solution.
This involves placing computing resources closer to where data originates, known as the “edge” (e.g., motors, pumps, generators, or other sensors). Doing so reduces the need to transfer data back and forth between centralized computing locations such as the cloud.
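To make the bandwidth argument concrete, here is a minimal sketch of edge-side aggregation. The sensor values, the window size, and the `summarize_at_edge` helper are all hypothetical, chosen only to illustrate how processing near the source can shrink what is sent to a central location.

```python
from statistics import mean


def summarize_at_edge(readings, window):
    """Collapse raw sensor readings into one summary per window of samples."""
    summaries = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        summaries.append(
            {"min": min(chunk), "max": max(chunk), "mean": mean(chunk)}
        )
    return summaries


# Hypothetical pump-temperature samples collected at the edge.
raw_readings = [70.1, 70.3, 70.2, 75.9, 70.0, 70.4, 70.2, 70.1]
edge_summaries = summarize_at_edge(raw_readings, window=4)

# Only the summaries would travel to the central cloud, not every raw sample.
print(len(raw_readings), "raw samples reduced to", len(edge_summaries), "summaries")
```

In this sketch the edge node forwards two summaries instead of eight raw samples; a real deployment would pick its own window and summary statistics.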
While still nascent, the technology already provides a more efficient method of data processing for numerous use cases, including autonomous vehicles. Tesla cars, for example, have powerful onboard computers which allow for low-latency data processing (in near real-time) for data collected by the vehicle’s dozens of peripheral sensors. This provides the vehicle with the ability to make timely, autonomous driving decisions.
However, other edge technologies, such as wireless medical devices and sensors, lack the necessary compute capacity to process large streams of complex data directly.
As a result, smaller, modular data centers are being deployed to provide hyper-local storage and processing capacity at the edge. According to the CB Insights Market Sizing tool, the global edge computing market is estimated to reach $34B by 2023.
These data centers, which are typically the size of a shipping container, are placed at the base of cell towers or as close to the origination of data as possible.
In addition to transportation and healthcare, these modular edge data centers are being used in industries such as manufacturing, agriculture, and energy & utilities. They are also helping mobile network operators (MNOs) deliver content faster to mobile subscribers, while many tech companies leverage these systems to store (or cache) content closer to their end users.
Vapor IO is one company offering colocation services at the edge by placing small data centers at the base of cell towers. The company has a strategic partnership with Crown Castle, the largest provider of wireless infrastructure in the US.
Source: Data Center Knowledge
Other edge data center businesses, such as EdgeMicro, deliver micro data centers that connect Mobile Network Operators (MNOs) with Content Providers (CPs) at the edge. EdgeMicro’s founders bring executive experience from organizations like Schneider Electric, one of Europe’s largest energy management companies, and CyrusOne, one of the largest and most successful data center service providers in the US.
The company recently unveiled its first production unit and plans to sell its colocation services to content providers like Netflix and Amazon, which will benefit from greater content delivery speeds and reliability. These colocation services are ideal for businesses that seek to own, but not manage, their data infrastructure.
Source: Edge Micro
And startups aren’t the only ones getting involved in the edge data center market.
Larger companies like Schneider Electric are not only partnering with startups but also developing edge data center products of their own. Schneider offers a number of different prefabricated modular data centers, which are ideal for a variety of industries that need hyper-local compute and storage capacity.
Huawei offers a similar prefabricated all-in-one modular unit that can be deployed in a variety of different environments for a variety of use cases.
These types of solutions integrate power, cooling, firefighting, lighting, and management systems into a single enclosure. They are designed for fast deployment, reliable operation, and remote monitoring.
CB Insights expert intelligence customers can learn more about edge data centers in our brief, Beyond The Cloud: 5 Startups Bringing Data Centers To The Edge.
Mega Data Centers
On the other end of the spectrum are mega data centers — data centers with at least 1M square feet of data center space. These facilities are large enough to serve the needs of tens of thousands of organizations at once and benefit greatly from economies of scale.
While these mega data centers are expensive to build, the cost per square foot is far better than that of an average data center.
One of the largest projects is a 17.4M square foot facility built by Switch Communications, which provides businesses with housing, cooling, power, bandwidth, and physical security for their servers.
In addition to this enormous “Citadel” campus in Tahoe Reno, the company has a 3.5M square foot data center in Las Vegas, a 1.8M sq ft campus in Grand Rapids, and a 1M sq ft campus in Atlanta. The Citadel campus is the largest colocation data center in the world, according to the company’s website.
Big tech has also been actively building mega data centers in the US, with Facebook, Microsoft, and Apple all building facilities to support their growing data storage needs.
For example, Facebook is building a 2.5M square foot data center in Fort Worth, Texas to process and store its troves of personal user information. Originally, the data center was expected to be just 750K square feet, but the social networking company decided to triple its size.
The new data center will cost approximately $1B and sit on a 150-acre plot of land, which will provide an opportunity for future expansion. In May 2017, 440K square feet of the server space became operational.
One of Microsoft’s most recent investments is a data center cluster in West Des Moines, Iowa, which cost the company a combined $3.5B. Together, this cluster of data centers totals 3.2M square feet of space, with the largest data center accounting for 1.7M square feet. This specific data center, called Project Osmium, is located on 200 acres of land and is expected to be complete by 2022.
Iowa has become a popular location for data centers in recent years due to its low energy costs (some of the lowest in the nation) and low risk of natural disaster.
Apple is also building a facility in Iowa, albeit a smaller one: the facility will measure 400K square feet and will cost $1.3B to build.
Source: Apple Newsroom
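As a rough illustration of the economies-of-scale point, the cost and size figures quoted above can be turned into a cost per square foot. This is only a back-of-the-envelope sketch: it assumes each quoted cost covers the full quoted floor area, which may not hold, since the reported figures could include land, equipment, or future phases.

```python
# Figures quoted in this section: (total cost in USD, size in square feet).
projects = {
    "Facebook, Fort Worth": (1_000_000_000, 2_500_000),
    "Microsoft, West Des Moines cluster": (3_500_000_000, 3_200_000),
    "Apple, Iowa": (1_300_000_000, 400_000),
}


def cost_per_sq_ft(cost_usd, square_feet):
    """Simple ratio of reported cost to reported floor area."""
    return cost_usd / square_feet


for name, (cost, size) in projects.items():
    print(f"{name}: ${cost_per_sq_ft(cost, size):,.0f} per sq ft")
```

On these reported numbers, the two larger projects come out well below the smaller Apple facility per square foot, which is consistent with the scale argument, though the inputs are too coarse to draw firm conclusions.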
While some of Apple’s facilities are built from the ground up to manage all its content — the app store, music streaming service, iCloud storage, user data — other structures have been re-purposed to house servers.
For example, Apple repurposed a 1.3M square foot solar panel manufacturing facility in Mesa, Arizona which opened in August 2018. The new data center runs on 100% clean energy thanks to a nearby solar farm.
Outside of the US, specific regions that have attracted these mega data centers include northern Europe, which has been a popular place for tech giants to build due to its cool temperature and increasing tax cuts.
Hohhot, China, located in Inner Mongolia, also offers a good location for mega data centers with access to cheap, local energy, cool temperatures, and nearby university talent (from Inner Mongolia University).
While mega data centers have become a theme throughout China, the Inner Mongolia region has become the hub for these types of developments. China Telecom (10.7M SF), China Mobile (7.8M SF), and China Unicom (6.4M SF) have all established mega data centers in this region.
Source: World’s Top Data Centers
CB Insights expert intelligence customers can learn more about mega data centers in our brief, Where Data Lives: 10 Companies Behind The World’s Largest Data Centers.
Data center innovation
In addition to data centers that are larger — and smaller — than ever before, new data center structures designed to benefit from their surrounding environments are another emerging trend.
Today, many organizations are experimenting with data centers that operate in and on the ocean. These structures use their environment, and its resources, to naturally cool the servers at low cost.
One of the most progressive examples of using the ocean for natural cooling is Microsoft’s Project Natick. As highlighted in a 2014 patent application titled Submerged Datacenter, Microsoft submerged a small cylindrical data center off the coast of Scotland.
The data center runs on 100% locally produced renewable electricity from on-shore wind and solar as well as off-shore tide and wave sources. It uses the ocean to dissipate infrastructure heat exhaust.
While the project is still in its early stages, Microsoft has made significant strides with it, and it could be the first sign of widespread oceanic data centers in the future.
Google has also experimented with oceanic data centers. In 2009, the company filed a patent for a Water-Based Data Center which envisions data centers atop floating barges.
Similar to Microsoft’s design, the structure would use the surrounding water to naturally cool the facility. In addition, the barge could also generate energy from oceanic currents.
While Google has not reported whether it has tested these structures, the company was thought to be responsible for a floating barge in the San Francisco Bay back in 2013.
Startup Nautilus Data Technologies has been executing on a very similar vision. The company has raised $58M in recent years to bring this floating data center vision to reality.
Source: Business Chief
The focus of the company is less on the vessel itself and more on the technology required to power and cool the data centers. Nautilus’ primary goals are to reduce the cost of computing, cut power usage, eliminate water consumption, decrease air pollution, and lower greenhouse gas emissions.
Energy & efficiency
Currently, 3% of all electricity consumption globally comes from data centers, and that number is only set to grow. And this electricity isn’t always clean: according to the United Nations, the information & communications technology (ICT) sector, fueled in large part by data centers, contributes the same amount of greenhouse gases as the aviation sector does from fuel.
While ICT has managed to curb electricity consumption growth in part by shutting down old, inefficient data centers, this strategy can only go so far as internet consumption and data production continue to increase.
Going forward, there are two key ways for data centers to reduce emissions: make energy usage inside a data center more efficient, and make sure that the energy being used is clean.
Reducing consumption & boosting energy efficiency
In 2016, DeepMind and Google worked on an AI recommendation system that would help make Google’s data centers more energy efficient. The focus was on minute improvements: even small changes were enough to help with significant energy savings and reduce emissions. However, once the recommendations came into place, implementing them required too much operator effort.
Data center operators asked if these could be implemented autonomously. However, per Google, no AI system is yet ready to entirely control a data center’s cooling and heating processes.
Google’s current AI control system works on specific actions but can only implement them under constraints that prioritize safety and reliability. According to the company, the new system delivers average energy savings of 30%.
A typical day of PUE (power usage effectiveness) with ML turned on and off. Source: DeepMind
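For readers unfamiliar with the metric, PUE is conventionally defined as total facility energy divided by the energy used by IT equipment, so a value closer to 1.0 means less overhead. The sketch below uses made-up daily figures (not Google or DeepMind data) and assumes, purely for illustration, that a 30% saving applies to the non-IT overhead such as cooling.

```python
def pue(it_energy_kwh, overhead_energy_kwh):
    """Power usage effectiveness: total facility energy divided by IT energy."""
    return (it_energy_kwh + overhead_energy_kwh) / it_energy_kwh


# Hypothetical daily figures, not taken from Google or DeepMind.
it_energy = 1000.0   # servers, storage, networking (kWh)
overhead = 200.0     # cooling and other non-IT overhead (kWh)

baseline = pue(it_energy, overhead)

# Assumption: the control system trims overhead energy by 30%.
optimized = pue(it_energy, overhead * 0.70)

print(f"baseline PUE:  {baseline:.2f}")
print(f"optimized PUE: {optimized:.2f}")
```

Because IT load dominates the total, even a sizable cut in overhead moves PUE by only a few hundredths, which is why small PUE changes can still represent meaningful energy savings.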
Google has also used an AI system to adjust efficiency in a Midwest data center during a tornado watch. While a human operator might have focused on preparing for the storm, the AI system took advantage of the tornado conditions, like drops in atmospheric pressure and changes in temperature and humidity, to tweak the data center’s cooling systems for peak efficiency during the storm. During winter, the AI control system adjusted to the weather to reduce the energy needed to cool the data center.
However, there are issues with this approach. Gaps in AI technology make it difficult to allow a data center to make efficiency decisions easily, and AI can be very difficult to scale. Each of Google’s data centers is unique, and it’s difficult to roll an AI tool out across all of them at once.
Another way to improve efficiency is to change how parts of a data center that overheat — like servers or certain types of chips — are cooled.
One such method involves using liquid instead of air to cool parts. Google CEO Sundar Pichai has said that the company’s newly released chips are so powerful that the company has had to submerge them in liquid in order to cool them properly.
Some operators are experimenting with submerging data centers underwater to make cooling easier and more energy efficient. This gives data centers constant access to naturally cool deep sea water, so heat is transferred from equipment into the surrounding ocean. Because a data center can be placed off any coast, it can elect to be connected to clean energy, as with Microsoft’s 40-foot-long Project Natick, which is powered by wind from the Orkney Islands grid.
An advantage that smaller underwater data centers add is that they can be modular, which makes them easier to deploy than new centers on land.
However, despite these advantages, some experts have warned against sinking data centers underwater. Because the cooling process involves pulling in clean water and releasing hot water into the surrounding region, these data centers can make the sea warmer, affecting local sea life. And although underwater data centers like Project Natick are designed to operate without human supervision, they can be difficult to fix if something goes wrong.
In the future, data centers might also contribute to clean energy and efficiency by recycling some of the heat they produce. Some projects are already exploring the possibility of thermal reuse. Nordic data center company DigiPlex has partnered with heating and cooling provider Stockholm Exergi to create a heat reuse system. The solution will collect excess heat in the data center and send it to the local district heating system, to potentially heat up to 10,000 Stockholm residences.
Buying cleaner energy
The ICT industry has made some strides in clean energy, and currently closes nearly half of all corporate agreements to buy renewable electricity, according to the IEA.
As of 2016, Google is the largest purchaser of corporate renewable energy on the planet.
Source: Google Sustainability
In December 2018, Facebook bought 200MW of energy from a member-owned solar cooperative. In Singapore, where land space is a limitation, Google has announced plans to buy 60MW worth of rooftop solar energy.
But while big companies like Google and Facebook have the resources to buy large amounts of renewable energy, this can be difficult for smaller companies, which may not need massive amounts of energy.
Another more expensive energy option is microgridding: installing an independent energy source inside a data center.
One example of this strategy is fuel cells, like those sold by recently public Bloom Energy. The cells use hydrogen as fuel to create electricity. While these alternative sources are often used as backup energy, a data center could rely entirely on these microgrids to survive — though it would be an expensive option.
Another way that data centers are exploring clean energy is through renewable energy credits (RECs), which represent a certain amount of clean energy and are typically used to balance out the “dirty energy” that a company has used. For any amount of dirty energy produced, RECs represent the production of an equivalent amount of clean energy in the world. These REC-backed amounts are then sold back into the renewable energy market.
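The offsetting arithmetic can be sketched in a few lines. This assumes the common convention that one REC corresponds to one megawatt-hour of renewable generation; the consumption and purchase figures are hypothetical, and the helper names are illustrative only.

```python
def rec_coverage(consumed_mwh, recs_purchased):
    """Share of consumption matched by RECs, assuming 1 REC = 1 MWh."""
    return min(recs_purchased / consumed_mwh, 1.0)


def rec_shortfall(consumed_mwh, recs_purchased):
    """RECs still needed to match all consumption (never negative)."""
    return max(consumed_mwh - recs_purchased, 0)


# Hypothetical annual figures for a single data center.
consumed_mwh = 50_000
recs_purchased = 30_000

print(f"matched by RECs: {rec_coverage(consumed_mwh, recs_purchased):.0%}")
print(f"RECs still needed: {rec_shortfall(consumed_mwh, recs_purchased):,}")
```

This is bookkeeping only: a fully matched facility in this sketch is still drawing whatever mix of power its local grid supplies.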
However, there are challenges with the REC model. For one thing, RECs only offset dirty energy — they don’t mean a data center is powered by clean energy. While the REC method is usually easier than finding accessible, affordable sources that can provide enough energy to meet the needs of a data center, it’s generally not accessible for smaller companies, who don’t always have the capital to bet on the fluctuations of a solar or wind farm.
In the future, technology will make it easier for small producers and buyers to work closely together. Marketplace models and renewable energy portfolios (like those created by mutual funds) that assess and quantify the fluctuations of renewable energy are emerging.
Solid State Drives
Solid State Drives, or SSDs, are a form of mass storage device that supports the reading and writing of data. These devices can store data without power — this is known as persistent storage. This differs from temporary forms of data storage, such as Random Access Memory (RAM), which only retain information while a device is operating.
The SSD has come to rival the Hard Disk Drive (HDD), another form of mass storage device. The primary difference between the SSD and HDD is that SSDs function without moving parts. This allows SSDs to be more responsive, more durable, and lighter, among other advantages.
Source: Silicon Power Blog
And while SSDs have become standard for laptops, smartphones, and other devices with slim profiles, SSDs are less practical, and therefore less common, in large data centers, due to their higher price point.
Source: Seagate & IDC
By the end of 2025, over 80% of enterprise data storage capacity shipped will still be HDDs, according to IDC. But beyond 2025, SSDs may become the storage hardware of choice for enterprises and their data centers.
As SSDs become widespread within consumer electronics and IoT devices, greater demand could induce greater supply, ultimately lowering costs.
In contrast to the newer SSDs, cold storage leverages many older technologies, such as CD-Rs and magnetic tapes, to store data using as little energy as possible.
With cold storage, accessing data takes much longer than with hot storage (such as SSDs). Only infrequently used data should be stored in these types of systems.
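One simple way to picture the hot-versus-cold decision is a rule keyed on how often data is read. The sketch below is illustrative only: the `StoredObject` type, the threshold of 10 reads per month, and the example objects are all assumptions, not a description of how any particular provider tiers data.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    name: str
    reads_per_month: int


def choose_tier(obj, hot_threshold=10):
    """Return 'hot' for frequently read data and 'cold' for everything else."""
    return "hot" if obj.reads_per_month >= hot_threshold else "cold"


# Hypothetical objects with made-up access rates.
objects = [
    StoredObject("user-profile-cache", reads_per_month=5000),
    StoredObject("2014-clickstream-archive", reads_per_month=0),
    StoredObject("quarterly-audit-logs", reads_per_month=2),
]

placement = {obj.name: choose_tier(obj) for obj in objects}
print(placement)
```

Real tiering policies usually weigh retrieval cost and latency requirements as well, so a single read-count threshold should be read as a simplification.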
That said, the world’s biggest tech companies — like Facebook, Google, and Amazon — all use cold storage to store even the most granular user data points. Cloud providers like Amazon, Microsoft, and Google also offer these services to customers who wish to store data at low costs.
According to the CB Insights Market Sizing tool, the cold storage market is expected to reach nearly $213B by 2025.
No business wants to find itself behind the curve in data collection. Most organizations would argue that it’s better to collect as much data as possible today, even if they have yet to determine how it will be used tomorrow.
This type of unused data is referred to as dark data. This is data that is collected, processed, and stored, but not typically used for a specific purpose. According to estimates from IBM, approximately 90% of sensor data collected from internet of things devices is never used.
Down the road, there may be better ways to reduce the overall quantity of dark data collected and stored. But currently, even with advances in artificial intelligence and machine learning, businesses are still incentivized to collect and store as much data as possible, to capitalize on future data-driven opportunities.
So for the foreseeable future, the best way to store this information is at the lowest possible cost. With cold storage, businesses can do just that. This trend will continue as users produce more data, and organizations collect it.
Other Forms of Data Storage
In addition to SSDs, HDDs, CDs, and magnetic tapes, there are a number of emerging storage technologies that promise greater storage capacity per unit.
One technology that has seen significant promise is heat-assisted magnetic recording, or HAMR. HAMR greatly increases the amount of data that can be stored on devices, such as HDDs, by heating the surface of a disk with a high-precision laser during writing. This allows for more accurate and stable data writing, which increases storage capacity.
This technology is expected to produce up to 20TB of HDD storage on one 3.5-inch drive by 2020, and to increase capacity by 30% to 40% each year thereafter. Seagate has already created a 16TB 3.5-inch drive, which it announced in December 2018.
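The growth range quoted above compounds quickly. As a rough sketch, the snippet below projects drive capacity from a 20TB baseline in 2020 at annual growth rates of 30% and 40%. It assumes growth compounds steadily every year, which is a simplification rather than a forecast.

```python
def projected_capacity_tb(base_tb, annual_growth, years):
    """Compound a starting capacity forward by a fixed annual growth rate."""
    return base_tb * (1 + annual_growth) ** years


BASE_YEAR = 2020
BASE_TB = 20  # capacity figure quoted in the text

for year in range(BASE_YEAR, BASE_YEAR + 6):
    elapsed = year - BASE_YEAR
    low = projected_capacity_tb(BASE_TB, 0.30, elapsed)
    high = projected_capacity_tb(BASE_TB, 0.40, elapsed)
    print(f"{year}: {low:.1f} TB to {high:.1f} TB")
```

The gap between the two rates widens each year, so the choice of 30% versus 40% matters a great deal over a five-year horizon.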
Seagate has been active in patenting this HAMR technology over the last decade. In 2017, the company filed for a record number of patents related to the technology.
(Note: There is a 12- to 18-month delay between when patents are filed and when they are published, so 2018 may see an even greater number of total HAMR patents.)
Significantly, the costs of this technology are reasonable compared to the standard HDDs offered today. This is due to the fact that much of the technology remains the same — the primary innovation is in the laser used to heat the magnetic disk surface.
Storage companies such as Western Digital are developing similar technologies. However, Western Digital uses microwave-assisted magnetic recording (MAMR), which uses microwaves instead of lasers. (This is likely a result of Seagate’s moat of HAMR-specific intellectual property.)
Moving forward, we should expect to see the storage capacities of HDDs getting significantly larger thanks to heat-assisted recording technologies.
In addition, there are a number of other emerging data storage concepts that show promise. One is Liquid-State Storage, which stores data in a liquid called vanadium dioxide. According to estimates, nearly 1TB of data could be stored in a single tablespoon of liquid.
In this section, we dive into the emerging trends and technologies that could shape the future of data centers.