VLSI Technology

Stage: Series B | IPO

Total Raised: $12M

About VLSI Technology

Manufacturer of a broad range of ASICs including cell-based, gate array, and programmable logic products.

Headquarters Location

1109 McKay Drive
San Jose, California 95131
United States

(408) 434-3000

Expert Collections containing VLSI Technology

Expert Collections are analyst-curated lists that highlight the companies you need to know in the most important technology spaces.

VLSI Technology is included in 1 Expert Collection: Semiconductors, Chips, and Advanced Electronics.

Semiconductors, Chips, and Advanced Electronics (6,544 items)

Companies in the semiconductors & HPC space, including integrated device manufacturers (IDMs), fabless firms, semiconductor production equipment manufacturers, electronic design automation (EDA), advanced semiconductor material companies, and more.

VLSI Technology Patents

VLSI Technology has filed 1 patent.

Application Date: 12/13/2010
Grant Date: 12/30/2014
Title:
Related Topics: Electronic documents, Content management systems, Parallel computing, Records management technology, Information technology management
Status: Grant

Latest VLSI Technology News

Increasing AI Energy Efficiency With Compute In Memory

Nov 16, 2023

How to process zettascale workloads and stay within a fixed power budget.

Skyrocketing AI compute workloads and fixed power budgets are forcing chip and system architects to take a much harder look at compute in memory (CIM), which until recently was considered little more than a science project. CIM solves two problems. First, it takes more energy to move data back and forth between memory and processor than to actually process it. And second, so much data is being collected through sensors and other sources and parked in memory that it is faster to pre-process at least some of that data where it is stored. Or, looked at differently, the majority of data is worthless but compute resources are valuable, so anything that reduces the volume of data is a good thing.

In a keynote address at the recent Hot Chips 2023 conference, Google Chief Scientist Jeff Dean observed that model sizes and the associated computing requirements are increasing by as much as a factor of 10 each year. [1] And while zettascale computing (at least 10²¹ operations per second) is within reach, it carries a high price tag. Case in point: Lisa Su, chair and CEO of AMD, observed that if current trends continue, the first zettascale computer will require 0.5 gigawatts of power, or about half the output of a typical nuclear power plant, for a single system. [2]

In a world increasingly concerned about energy demand and energy-related carbon emissions, the assumption that data centers can grow indefinitely is no longer valid. And even if it were, the physics of interconnect speed and thermal dissipation place hard limits on data bandwidth.

Simple multiplication and addition … times a billion parameters

Machine learning models have massive data transfer needs relative to their modest computing requirements. In neural networks, both the inference and training stages typically involve multiplying a large matrix (A) by some input vector (αx) and adding a bias term (βy) to the result:

y ← αAx + βy

Some models use millions or even billions of parameters. With such large matrices, reading and writing the data to be operated on may take much longer than the calculation itself. ChatGPT, the large language model, is an example. The memory-bound portion of the workload accounts for as much as 80% of total execution time. [3] At last year's IEEE International Electron Devices Meeting, Dayane Reis, assistant professor at the University of South Florida, and her colleagues noted that table lookup operations for recommendation engines can account for 70% of execution time. [4] For this reason, compute-in-memory (CIM) architectures can offer an attractive alternative.

Others agree. "A number of modern workloads suffer from low arithmetic intensity, which means they fetch data from main memory to perform one or two operations and then send it back," said James Myers, STCO program manager at imec. "In-memory compute targets these workloads specifically, where lightweight compute closer to the memory has the potential to improve overall performance and energy efficiency."

By targeting data transfer requirements, engineers can drastically reduce both execution time and power consumption. Designing efficient CIM architectures is non-trivial, though.
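
Before looking at specific architectures, it is worth making the arithmetic-intensity argument concrete. The following minimal NumPy sketch, added here for illustration (the matrix size, data type, and α/β values are arbitrary assumptions, not from the article), performs y ← αAx + βy and estimates its arithmetic intensity, i.e., floating-point operations per byte moved to and from memory:

    import numpy as np

    # Minimal GEMV sketch: y <- alpha*A@x + beta*y.
    # The size n, dtype, and alpha/beta below are arbitrary assumptions.
    n = 4096
    rng = np.random.default_rng(0)
    A = rng.standard_normal((n, n), dtype=np.float32)
    x = rng.standard_normal(n, dtype=np.float32)
    y = rng.standard_normal(n, dtype=np.float32)
    alpha, beta = 1.0, 1.0

    y = alpha * (A @ x) + beta * y

    # Arithmetic intensity: roughly 2*n*n multiply-adds against the bytes
    # that must cross the memory interface (read A, x, y once; write y once).
    flops = 2 * n * n
    bytes_moved = A.nbytes + x.nbytes + 2 * y.nbytes
    print(f"arithmetic intensity ~ {flops / bytes_moved:.2f} FLOPs/byte")

At roughly 0.5 floating-point operations per byte in single precision, nearly all of the time and energy goes into streaming A through the memory interface rather than into the arithmetic itself, which is exactly the imbalance Myers describes and that CIM targets.
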
In work presented at this year's VLSI Symposium, researcher Yuhao Ju and colleagues at Northwestern University considered AI-related tasks for robotics applications. [5] Here, general-purpose computing accounts for more than 75% of the total workload, including such tasks as trajectory tracking and camera localization. In recommendation engines, the large pool of possibilities identified through table lookups still needs to be filtered against the user query. Even when neural network calculations are clearly identifiable as a limiting factor, the exact algorithms involved differ. Neural network research is advancing faster than integrated circuit design cycles. A hardware accelerator designed for a particular algorithm might be obsolete by the time it is actually realized in silicon.

One possible solution, seen in designs like Samsung's LPDDR-PIM accelerator module, relies on a simple but general-purpose calculation module optimized for matrix multiplication or some other arithmetic operation. Software tools designed to manage memory-coupled computing assume the job of effectively partitioning the workload.

Fig. 1: One possible PIM architecture places a floating-point module between memory banks. Source: K. Derbyshire/Semiconductor Engineering

However, using software to assign tasks adds overhead. If the same data is sent to multiple accelerators, one for each stage in an algorithm, the advantage of CIM may be lost. Another approach, proposed by the Northwestern University group, integrates a CNN accelerator with conventional general-purpose logic. The CPU writes to a memory array, which the accelerator treats as one layer of a CNN. The results are written to an output array, which the CPU treats as an input cache. This approach reduced end-to-end latency by as much as 56%.

Fig. 2: A unified architecture can boost both neural network and vector calculations. Source: K. Derbyshire/Semiconductor Engineering

These solutions are feasible with current device technology. They depend on conventional CMOS logic and DRAM memory circuits, where the data path is fixed once the circuit is fabricated. In the future, fast non-volatile memories could enable reconfigurable logic arrays, potentially blurring the line between "software" and "hardware."

How emerging memories help

Reis and colleagues designed a configurable memory array based on FeFETs to accelerate a recommendation system. Each array can operate in RAM mode to read and write lookup tables, perform Boolean logic and arithmetic operations in GPCiM (general-purpose compute-in-memory) mode, or operate in content-addressable memory (CAM) mode to search the entire array in parallel. In simulations, this architecture achieved a factor of 17 reduction in end-to-end latency and a 713X energy improvement for queries on the MovieLens dataset.

Part of the appeal of 3D integration is the potential to improve performance by increasing bandwidth and reducing the data path length. Yiwei Du and colleagues at Tsinghua University built an HfO2/TaOx ReRAM array on top of conventional CMOS logic, then added a third layer with InGaZnOx FeFET transistors. The CMOS layer served as control logic, while the FeFET layer provided a reconfigurable data path. In this design, a standard process element uses the CMOS layer and associated RRAM array to implement matrix-vector multiplication. A full network layer requires more than one process element, so the FeFET layer coordinates data transfer. Overall, the chip consumed 6.9X less energy than its two-dimensional counterpart. Networks with more complex connections between nodes achieved even more dramatic reductions. [6]

For several years, researchers also have been investigating the use of ReRAM arrays themselves as arithmetic elements. Under Ohm's law, applying a voltage across a programmed resistance is a multiplication step (I = V/R), while Kirchhoff's current law sums the resulting currents across an array. Operating directly on the memory array is, in theory, one of the most efficient possible architectures. Unfortunately, resistive losses limit the practical array size. Rather than using a single array, RRAM-based computations will need to break problems into "tiles," then combine the results. [7]
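
The tiled approach is easy to model numerically. The sketch below is added here for illustration and is not from the article; the matrix and tile sizes are arbitrary assumptions, and the model ignores analog non-idealities such as wire resistance and ADC quantization. It treats each tile as a small crossbar: the stored weights act as conductances, the corresponding input slice is applied as voltages, Ohm's law performs the multiplications, Kirchhoff's current law sums each output line, and the per-tile partial results are accumulated digitally.

    import numpy as np

    # Toy numerical model of tiled analog matrix-vector multiplication.
    # Within a tile, each stored weight acts as a conductance: applying the
    # input voltages multiplies element-wise (Ohm's law), and the currents on
    # each output line add up (Kirchhoff's current law), giving a block
    # matrix-vector product. Resistive losses limit the practical tile size,
    # so a large matrix is split into tiles and the partial sums are combined.
    # The matrix size and tile size here are arbitrary assumptions.

    rng = np.random.default_rng(1)
    M, N, TILE = 512, 512, 128          # full matrix and assumed max crossbar size
    W = rng.standard_normal((M, N))     # weights, mapped to cell conductances
    x = rng.standard_normal(N)          # inputs, applied as voltages

    y = np.zeros(M)
    for r in range(0, M, TILE):                 # tile over output rows
        for c in range(0, N, TILE):             # tile over input columns
            g_tile = W[r:r + TILE, c:c + TILE]  # conductances of one crossbar tile
            v_in = x[c:c + TILE]                # voltages driven into this tile
            i_out = g_tile @ v_in               # Ohm's law + Kirchhoff summation
            y[r:r + TILE] += i_out              # digital accumulation across tiles

    assert np.allclose(y, W @ x)                # tiling reproduces the full product

The assertion at the end simply confirms that splitting the multiplication into tiles and recombining the partial sums reproduces the full matrix-vector product; in hardware, the cross-tile accumulation happens in peripheral circuits.
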
Memory vendors like Samsung and Hynix have been showing compute-in-memory concepts at conferences like Hot Chips for several years. As Dean pointed out, though, traditional data center metrics have devalued energy efficiency in favor of absolute performance. Such performance-first metrics are no longer sufficient in an increasingly power-constrained environment. If AI applications are to continue to grow at current rates, designers must prioritize new power-efficient architectures.

References

[1] J. Dean and A. Vahdat, "Exciting Directions for ML Models and the Implications for Computing Hardware," 2023 IEEE Hot Chips 35 Symposium (HCS), Palo Alto, CA, USA, 2023, pp. 1-87, doi: 10.1109/HCS59251.2023.10254704.

[2] L. Su and S. Naffziger, "1.1 Innovation For the Next Decade of Compute Efficiency," 2023 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 2023, pp. 8-12, doi: 10.1109/ISSCC42615.2023.10067810.

[3] J. H. Kim et al., "Samsung PIM/PNM for Transformer Based AI: Energy Efficiency on PIM/PNM Cluster," 2023 IEEE Hot Chips 35 Symposium (HCS), Palo Alto, CA, USA, 2023, pp. 1-31, doi: 10.1109/HCS59251.2023.10254711.

[4] D. Reis et al., "Ferroelectric FET Configurable Memory Arrays and Their Applications," 2022 International Electron Devices Meeting (IEDM), San Francisco, CA, USA, 2022, pp. 21.5.1-21.5.4, doi: 10.1109/IEDM45625.2022.10019490.

[5] Y. Ju et al., "A General-Purpose Compute-in-Memory Processor Combining CPU and Deep Learning with Elevated CPU Efficiency and Enhanced Data Locality," 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Kyoto, Japan, 2023, pp. 1-2, doi: 10.23919/VLSITechnologyandCir57934.2023.10185311.

[6] Y. Du et al., "Monolithic 3D Integration of FeFET, Hybrid CMOS Logic and Analog RRAM Array for Energy-Efficient Reconfigurable Computing-In-Memory Architecture," 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Kyoto, Japan, 2023, pp. 1-2, doi: 10.23919/VLSITechnologyandCir57934.2023.10185221.

[7] G. W. Burr et al., "Phase Change Memory-based Hardware Accelerators for Deep Neural Networks (invited)," 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits), Kyoto, Japan, 2023, pp. 1-2, doi: 10.23919/VLSITechnologyandCir57934.2023.10185411.

VLSI Technology Frequently Asked Questions (FAQ)

  • Where is VLSI Technology's headquarters?

    VLSI Technology's headquarters is located at 1109 McKay Drive, San Jose.

  • What is VLSI Technology's latest funding round?

    VLSI Technology's latest funding round is Series B.

  • How much did VLSI Technology raise?

    VLSI Technology raised a total of $12M.

  • Who are the investors of VLSI Technology?

    Investors of VLSI Technology include Advanced Technology Ventures.
