Founded Year

2016

Series E | Alive

Total Raised

$602.82M

Last Raised

$325M | 3 yrs ago



About Scale

Scale provides a data engine platform. The platform provides generative artificial intelligence (AI) strategy including fine-tuning, prompt engineering, security, model safety, model evaluation, and enterprise applications. It serves industries such as retail, electronic commerce, logistics, and more. It was founded in 2016 and is based in San Francisco, California.

Headquarters Location

303 2nd Street South Tower, Floor 5

San Francisco, California, 94107

United States



ESPs containing Scale

The ESP matrix leverages data and analyst insight to identify and rank leading companies in a given technology landscape.

Enterprise Tech / Data Management

The machine learning training data curation market offers solutions to support data quality control in the AI algorithm training process. These solutions help organizations complete key tasks, such as selecting the best subsets of data for training models, triaging datasets for bias, and identifying labeling errors. Ultimately, these solutions help minimize the downstream effects of poor-quality…

Scale is named a Leader among 9 other companies, including Snorkel AI, Lightly, and Peroptyx.

Scale's Products & Differentiators

    Scale Rapid

    Scale Rapid is a self-serve, on-demand labeling service for training AI with high-quality ground-truth data. It enables machine learning engineers and researchers to receive high-quality labels and instruction feedback within hours and to scale to production volumes within days. With Rapid, customers can create labeling projects, upload data for labeling via UI or API, design and submit labeling instructions, and direct quality improvements with new or updated evaluation tasks. This supports early AI/ML prototyping and gives customers more insight into and control over the labeling workflow, since they receive prompt feedback on labeling instructions and potential edge cases for rapid iteration.


Expert Collections containing Scale

Expert Collections are analyst-curated lists that highlight the companies you need to know in the most important technology spaces.

Scale is included in 6 Expert Collections, including Auto Tech.


Auto Tech

2,476 items

Companies working on automotive technology, which includes vehicle connectivity, autonomous driving technology, and electric vehicle technology. This includes EV manufacturers, autonomous driving developers, and companies supporting the rise of software-defined vehicles.


Unicorns- Billion Dollar Startups

1,228 items


Artificial Intelligence

10,958 items

Companies developing artificial intelligence solutions, including cross-industry applications, industry-specific products, and AI infrastructure solutions.


Tech IPO Pipeline

539 items

Track and capture company information and workflow.


AI 100

100 items

Winners of CB Insights' 5th annual AI 100, a list of the 100 most promising private AI companies in the world.


Generative AI

414 items

Companies working on generative AI applications and infrastructure.

Scale Patents

Scale has filed 10 patents.

The 3 most popular patent topics include:

  • artificial neural networks
  • artificial intelligence
  • computer vision

Latest Scale News

Rethinking How Data is Stored and Processed Brings Scale and Speed to Modern Data-Intensive Applications

Nov 28, 2023

Key-value databases are at the forefront of many modern data-intensive business applications and are widely adopted across several industry verticals, including e-commerce, online gaming, content delivery networks, social networking, and messaging services. IT spending for these revenue-generating applications is tied to a percentage of the organization's topline revenue, which highlights how much importance is placed on designing such applications for superior performance and exceptional customer experience.

Key-value databases provide a simple and highly flexible interface for building caching, distributed storage, file systems, and database systems. Enterprises need to select key-value databases (and their architecture setup) that are appropriate for their business application. The choice of server, storage, and storage engine design should allow applications to focus on addressing business problems without worrying about whether performance or scalability can meet peak demands.

While businesses rely heavily on key-value databases, they continue to encounter challenges. These challenges typically fall into two categories: system design and storage.

System Design & Storage Challenges

High Memory Usage: Some key-value databases store all datasets in memory, which leads to memory-intensive data processing. When the application's working set of indexes and most frequently accessed data exceeds available memory, disk performance quickly becomes the limiting factor for throughput. At the same time, it can be challenging to ensure that performance scales with user demand without costs outpacing projections.

Cost Problems at Scale: For persistent key-value databases with large-scale deployments, the cost of storage can become significant. Optimizing storage efficiently without compromising performance becomes a concern, especially in cloud environments where costs are directly tied to storage consumption.

Data Compression: Key-value databases might need to employ compression techniques to reduce storage overhead. However, the choice of compression algorithm can cause CPU spikes and performance degradation, especially when frequent decompression is required.

Data Fragmentation: As data is modified, updated, or deleted, storage can become fragmented over time. This leads to space amplification, with inefficient use of disk space, and to read amplification, with increased I/O operations during reads.

Scalability Issues: Even though key-value stores are designed for high scalability, challenges can arise with increased user concurrency and when scaling out, such as managing data consistency across distributed systems or handling partitioning and replication.

Employing Multiple Key-Value Stores

Organizations often have many separate key-value systems owned by different teams, each with different API feature sets and optimization parameters and techniques. This results in duplicated development efforts, high operational overhead and incident counts, and confusion among engineering teams' customers. For example, organizations running storefront applications store user logins, clickthroughs, and preferences in a key-value store. Gaming and learning applications use key-value stores for leaderboards. Real-time recommendation and ad-tech applications use KV stores. Caching applications use KV stores for data that doesn't change often, with lifetimes ranging from seconds to a few days.

While SQL provides a standard query language for relational databases, no such universally accepted standard exists for key-value stores. This lack of standardization can make switching between different key-value databases a more significant challenge.

The answer? Special-purpose hardware accelerators for key-value databases.

All of this has combined to create an immediate need for a new generation of hardware-accelerated data processing and storage management technology. From GPUs and TPUs built for the most demanding AI and Gen AI models to DPUs that enable a workload-optimized approach, the creation of dedicated processors for data-intensive tasks is not a new concept. What is needed are dedicated key-value accelerators that combine hardware and software components to address ever-growing performance, scalability, and data growth demands. The ultimate objective is to accelerate application performance, enable data growth with built-in compression, optimize data flow to increase SSD endurance and prolong usable life, and bring down the cost of scaling performance and capacity.

Such solutions should enable key-value database applications to exploit the full performance of modern SSD storage. They should address data growth challenges with efficient data compression algorithms that offload compression work from the CPU. The architecture should facilitate scaling from several hundred terabytes up to petabytes of storage, supporting the management of tens of billions to trillions of key-value pairs.

These accelerators should employ an open standards-based approach. Take RocksDB, for example: it serves as the foundation for numerous applications, such as Redis, MyRocks, Kafka Streams, Spark structured streaming, TiKV, KVRocks, ArangoDB, and many others. Enterprises should be able to easily migrate to and from the platform with no vendor lock-in. The needed architecture would embrace adaptability for future innovations.

Ultimately, key-value accelerator systems should empower enterprises to focus on what matters most, application and business growth, and free them from concerns about storage management and performance.
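
The compression trade-off raised in the article (less storage in exchange for CPU time spent compressing and decompressing) can be illustrated with a minimal sketch. It uses Python's standard `zlib` module purely as a stand-in; the article does not name a compression algorithm. The function name `compression_tradeoff` and the sample payload are assumptions made for illustration only.

```python
import time
import zlib


def compression_tradeoff(payload: bytes, levels=(1, 6, 9)):
    """Return (level, ratio, compress_seconds, decompress_seconds) per zlib level.

    Illustrative helper (not from the article): a higher level usually trades
    more CPU time for a smaller stored value.
    """
    results = []
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(payload, level)
        compress_s = time.perf_counter() - start

        start = time.perf_counter()
        unpacked = zlib.decompress(packed)
        decompress_s = time.perf_counter() - start

        # Round-trip check: compression must be lossless.
        assert unpacked == payload
        results.append((level, len(payload) / len(packed), compress_s, decompress_s))
    return results


# Hypothetical sample: a repetitive JSON-like value, as a key-value store might hold.
sample = b'{"user_id": 12345, "pref": "dark_mode", "clicks": 42}' * 2000

for level, ratio, c_s, d_s in compression_tradeoff(sample):
    print(f"level={level} ratio={ratio:.1f}x "
          f"compress={c_s * 1e3:.2f}ms decompress={d_s * 1e3:.2f}ms")
```

Timings depend on the machine and the data, so the numbers are only indicative. Every compressed value read by the application has to pay the decompression cost on the host CPU, which is the work the article proposes offloading to dedicated hardware.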
About the Author

Prasad Venkatachar is Sr Director, Products & Solutions at Pliops. Prasad is an experienced IT professional with 20 years of combined experience in product strategy and management, marketing, solution architecture, and IT services. Over those 20 years he has launched multiple industry-leading databases, data warehouses, data lakes, and AI/ML products and solutions in collaboration with Microsoft, IBM, Oracle, Google, MongoDB, Cloudera, and ISV partners. He has served Fortune 500 enterprise customers as an SME, delivering business value through technical and financial benefits for data center and cloud deployments. He has also served on the Microsoft Data and AI Partner Advisory Council and as a member of Lenovo Technology Innovation.

Scale Frequently Asked Questions (FAQ)

  • When was Scale founded?

    Scale was founded in 2016.

  • Where is Scale's headquarters?

    Scale's headquarters is located at 303 2nd Street, San Francisco.

  • What is Scale's latest funding round?

    Scale's latest funding round is Series E.

  • How much did Scale raise?

    Scale raised a total of $602.82M.

  • Who are the investors of Scale?

    Investors of Scale include Y Combinator, Index Ventures, Founders Fund, Coatue, Tiger Global Management and 21 more.

  • Who are Scale's competitors?

    Competitors of Scale include Aindo, Deasie, Fastagger, Cleanlab, Ango AI and 7 more.

  • What products does Scale offer?

    Scale's products include Scale Rapid and 4 more.

  • Who are Scale's customers?

    Customers of Scale include Brex, iRobot, Flexport, Toyota Research Institute (TRI) and U.S. Air Force.


Compare Scale to Competitors

Labelbox

Labelbox offers a training data platform for machine learning teams to build real-world artificial intelligence (AI) solutions. The platform consists of label editor tools for batch and real-time labeling workflows, collaboration, quality review, analytics, and more. It serves the government, retail, insurance, manufacturing, and healthcare sectors. It was founded in 2018 and is located in San Francisco, California.

Snorkel AI

Snorkel AI develops a system for programmatically building and managing training datasets. The company's platform allows users to develop training datasets and reduces the time, cost, and friction of labeling training data. It was founded in 2015 and is based in Redwood City, California.

CloudFactory

CloudFactory provides workforce solutions for machine learning and business process optimization. The company offers services such as data labeling, accelerated annotation, and human-in-the-loop automation, which support workflows and fill gaps in AI and automation. CloudFactory primarily serves sectors such as autonomous vehicles, finance, healthcare, insurance, and retail. It was founded in 2010 and is based in Reading, England.

[…] is a company that focuses on providing high-quality and ethical data for artificial intelligence applications, operating within the AI data marketplace industry. The company offers a wide range of pre-collected and structured training datasets, including text, voice, and image data, which are used to train AI models. These datasets are available for purchase, sale, or commission through its online marketplace. Additionally, […] provides professional services to assist with complex machine learning projects. It was founded in 2015 and is based in Seattle, Washington.

Alegion

Alegion is a company that focuses on data annotation and collection services, operating within the artificial intelligence and machine learning industry. The company offers services such as data collection, data annotation, and quality control, aimed at transforming unstructured data into high-quality, model-ready training data. Alegion primarily serves sectors such as healthcare, hospitality, insurance, manufacturing, retail, security, software, and sports. It was founded in 2012 and is based in Austin, Texas.

Sama

Sama provides training data for artificial intelligence (AI) and machine learning (ML) algorithms. It focuses on video, image, language, and sensor data annotations and validation. The company provides its service to a wide range of industries such as media, technology, retail, agriculture, and transportation. It was formerly known as Samasource. The company was founded in 2008 and is based in San Francisco, California.

