PICO-RAM: A PVT-Insensitive Analog Compute-In-Memory SRAM Macro with In-Situ Multi-Bit Charge Computing and 6T Thin-Cell-Compatible Layout

Read original: arXiv:2407.12829 - Published 7/19/2024 by Zhiyu Chen, Ziyuan Wen, Weier Wan, Akhil Reddy Pakala, Yiwei Zou, Wei-Chen Wei, Zengyi Li, Yubei Chen, Kaiyuan Yang

Overview

• This paper presents a novel SRAM (static random-access memory) macro called PICO-RAM that enables analog compute-in-memory capabilities.

• PICO-RAM is designed to be insensitive to process, voltage, and temperature (PVT) variations, and it supports in-situ multi-bit charge computing and a 6T (six-transistor) thin-cell-compatible layout.

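• The key innovations are a multi-bit thin-cell multiply-accumulate (MAC) unit that shares the transistor layout of the most compact 6T SRAM cell, in-situ charge-domain computation that reuses one set of local capacitors inside the array, and a compact 8.5-bit time-domain ADC that power gates its main path most of the time.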

Plain English Explanation

PICO-RAM is a new type of computer memory that can not only store data but also perform computations directly on that data. This is known as "compute-in-memory" and can be very efficient for certain types of computations, such as those used in deep learning algorithms.

One of the key features of PICO-RAM is that it is designed to be insensitive to changes in the manufacturing process, power supply voltage, and temperature. This is important because these factors can often cause problems in electronic circuits, leading to unreliable or inconsistent behavior. By making PICO-RAM PVT-insensitive, the researchers have created a more robust and reliable memory system.

Another important aspect of PICO-RAM is its ability to perform multi-bit charge computing. Values are represented as electric charge on small capacitors, and the macro operates on all bits of a multi-bit number in one step (bit-parallel) instead of processing one bit at a time and combining the partial results afterwards (bit-serial). This can improve the efficiency and speed of computations such as the matrix-vector multiplications used in deep learning.
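To make the difference between the two schemes concrete, here is a minimal sketch in plain Python. It is purely digital and illustrative: the function names, the 4-bit activation width, and the example vectors are assumptions chosen for this example, not details taken from the paper. It only shows that a bit-serial dot product (one activation bit at a time, followed by shift-and-add) and a bit-parallel dot product arrive at the same answer, with the bit-serial version needing one pass per bit.

```python
def bit_serial_dot(weights, activations, act_bits=4):
    """Dot product computed one activation bit-plane at a time.

    Each pass uses a single bit of every activation, and the partial
    results are combined with a digital shift-and-add.
    """
    total = 0
    for b in range(act_bits):
        # Partial sum using only bit b of each (unsigned) activation.
        partial = sum(w * ((a >> b) & 1) for w, a in zip(weights, activations))
        # Weight the partial sum by 2**b and accumulate.
        total += partial << b
    return total


def bit_parallel_dot(weights, activations):
    """Dot product computed on the full multi-bit activations in one pass."""
    return sum(w * a for w, a in zip(weights, activations))


# Hypothetical example data: signed weights, unsigned 4-bit activations.
weights = [3, -2, 5, 1]
activations = [7, 4, 0, 15]

serial_result = bit_serial_dot(weights, activations)
parallel_result = bit_parallel_dot(weights, activations)
print(serial_result, parallel_result)
```

Both calls return the same value. The practical difference in an analog macro is where the work happens: the bit-serial approach repeats the analog step once per bit and combines the results digitally, while a bit-parallel design has to get the whole multi-bit result right in the analog domain.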

Finally, PICO-RAM uses a 6T (six-transistor) thin-cell layout, which allows for higher-density integration of the memory cells. This means that more memory can be packed into a smaller area, which is important for applications where space and power consumption are at a premium, such as in mobile devices or embedded systems.

Technical Explanation

The key innovations of PICO-RAM are:

  1. PVT-Insensitive Analog Compute-in-Memory Circuit: The researchers have developed an analog compute-in-memory circuit designed to be insensitive to changes in the manufacturing process, power supply voltage, and temperature. This is achieved through the use of a feedback-based charge injection scheme and a dynamic voltage biasing technique. The 65-nm prototype is reported to remain robust from -40 to 105 °C and from 0.65 to 1.2 V.

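  2. In-Situ Multi-Bit Charge Computing: PICO-RAM computes in the charge domain and in a bit-parallel fashion, operating on multi-bit values directly instead of digitally shifting and adding 1-bit partial results. "In-situ" means that all analog computing modules, including the digital-to-analog converters (DACs), MAC units, analog shift-and-add, and analog-to-digital converters (ADCs), reuse one set of local capacitors inside the array, which saves area and enhances accuracy.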

  3. 6T Thin-Cell-Compatible Layout: The multi-bit MAC unit shares the same transistor layout as the most compact 6T SRAM cell, which enables high-density integration. The prototype achieves a weight storage density of 559 Kb/mm², reported as the highest among SRAM-based analog CIM designs.
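As a rough intuition for charge-domain computing in general, the sketch below models ideal charge sharing between capacitors. It is an idealized textbook model under stated assumptions (identical unit capacitors, 1-bit weights, no parasitics, mismatch, or noise) and is not the circuit described in the paper. The function names, the 1 fF unit capacitor, and the example voltages are hypothetical. It illustrates two things: shorting the capacitors together yields a voltage proportional to a multiply-accumulate result, and that voltage depends on capacitor ratios, so a uniform shift in capacitance leaves it unchanged.

```python
def charge_share(voltages, caps):
    """Voltage after ideal capacitors are shorted together.

    Charge is conserved, so V_out = sum(C_i * V_i) / sum(C_i).
    """
    total_charge = sum(c * v for c, v in zip(caps, voltages))
    return total_charge / sum(caps)


def charge_domain_mac(weight_bits, input_voltages, unit_cap=1.0e-15):
    """Idealized 1-bit-weight MAC using identical unit capacitors.

    A capacitor samples its input voltage when the weight bit is 1 and
    stays at 0 V otherwise. Sharing charge then gives the average of the
    products w_i * v_i.
    """
    caps = [unit_cap] * len(weight_bits)
    sampled = [v if w else 0.0 for w, v in zip(weight_bits, input_voltages)]
    return charge_share(sampled, caps)


# Hypothetical example: four cells, inputs already converted to voltages.
weight_bits = [1, 0, 1, 1]
input_voltages = [0.2, 0.8, 0.4, 0.6]

nominal = charge_domain_mac(weight_bits, input_voltages)
# Assume every capacitor comes out 20% larger (a uniform process shift).
shifted = charge_domain_mac(weight_bits, input_voltages, unit_cap=1.2e-15)
print(nominal, shifted)
```

In this ideal model both results are about 0.3 V, because the common capacitance factor cancels in the ratio. Real designs still have to deal with capacitor mismatch, parasitics, and the variation of the surrounding circuits, which this sketch deliberately ignores.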

The researchers have evaluated PICO-RAM through both circuit-level simulations and hardware experiments, demonstrating its effectiveness in terms of PVT-insensitivity, multi-bit charge computing, and density-efficient layout.

Critical Analysis

The researchers have addressed several important challenges in the design of PICO-RAM, such as PVT-insensitivity and multi-bit charge computing. However, the paper does not provide a detailed comparison of PICO-RAM's performance and energy efficiency with other state-of-the-art analog compute-in-memory architectures, such as those presented in the Analog or Digital Memory Computing paper. Additionally, the paper does not discuss the potential limitations of the 6T thin-cell layout, such as its impact on read/write performance or reliability.

Further research could explore the scalability of PICO-RAM, its integration with digital logic, and its applicability to a wider range of compute-in-memory workloads, such as those found in neuromorphic computing or in-memory processing systems.

Conclusion

The PICO-RAM SRAM macro presented in this paper represents a significant advancement in the field of analog compute-in-memory technologies. By addressing key challenges such as PVT-insensitivity, multi-bit charge computing, and density-efficient layout, PICO-RAM has the potential to enable more energy-efficient and high-performance in-memory computing for a wide range of applications, including deep learning, edge computing, and embedded systems.

Related Papers

PICO-RAM: A PVT-Insensitive Analog Compute-In-Memory SRAM Macro with In-Situ Multi-Bit Charge Computing and 6T Thin-Cell-Compatible Layout

Zhiyu Chen, Ziyuan Wen, Weier Wan, Akhil Reddy Pakala, Yiwei Zou, Wei-Chen Wei, Zengyi Li, Yubei Chen, Kaiyuan Yang

Analog compute-in-memory (CIM) in static random-access memory (SRAM) is promising for accelerating deep learning inference by circumventing the memory wall and exploiting ultra-efficient analog low-precision arithmetic. Latest analog CIM designs attempt bit-parallel schemes for multi-bit analog Matrix-Vector Multiplication (MVM), aiming at higher energy efficiency, throughput, and training simplicity and robustness over conventional bit-serial methods that digitally shift-and-add multiple partial analog computing results. However, bit-parallel operations require more complex analog computations and become more sensitive to well-known analog CIM challenges, including large cell areas, inefficient and inaccurate multi-bit analog operations, and vulnerability to PVT variations. This paper presents PICO-RAM, a PVT-insensitive and compact CIM SRAM macro with charge-domain bit-parallel computation. It adopts a multi-bit thin-cell Multiply-Accumulate (MAC) unit that shares the same transistor layout as the most compact 6T SRAM cell. All analog computing modules, including digital-to-analog converters (DACs), MAC units, analog shift-and-add, and analog-to-digital converters (ADCs) reuse one set of local capacitors inside the array, performing in-situ computation to save area and enhance accuracy. A compact 8.5-bit dual-threshold time-domain ADC power gates the main path most of the time, leading to a significant energy reduction. Our 65-nm prototype achieves the highest weight storage density of 559 Kb/mm$^2$ and exceptional robustness to temperature and voltage variations (-40 to 105 $^{\circ}$C and 0.65 to 1.2 V) among SRAM-based analog CIM designs.

7/19/2024

A 65nm 8b-Activation 8b-Weight SRAM-Based Charge-Domain Computing-in-Memory Macro Using A Fully-Parallel Analog Adder Network and A Single-ADC Interface

Guodong Yin, Mufeng Zhou, Yiming Chen, Wenjun Tang, Zekun Yang, Mingyen Lee, Xirui Du, Jinshan Yue, Jiaxin Liu, Huazhong Yang, Yongpan Liu, Xueqing Li

Performing data-intensive tasks in the von Neumann architecture is challenging to achieve both high performance and power efficiency due to the memory wall bottleneck. Computing-in-memory (CiM) is a promising mitigation approach by enabling parallel in-situ multiply-accumulate (MAC) operations within the memory with support from the peripheral interface and datapath. SRAM-based charge-domain CiM (CD-CiM) has shown its potential of enhanced power efficiency and computing accuracy. However, existing SRAM-based CD-CiM faces scaling challenges to meet the throughput requirement of high-performance multi-bit-quantization applications. This paper presents an SRAM-based high-throughput ReLU-optimized CD-CiM macro. It is capable of completing MAC and ReLU of two signed 8b vectors in one CiM cycle with only one A/D conversion. Along with non-linearity compensation for the analog computing and A/D conversion interfaces, this work achieves 51.2GOPS throughput and 10.3TOPS/W energy efficiency, while showing 88.6% accuracy in the CIFAR-10 dataset.

4/3/2024

PACiM: A Sparsity-Centric Hybrid Compute-in-Memory Architecture via Probabilistic Approximation

Wenlun Zhang, Shimpei Ando, Yung-Chin Chen, Satomi Miyagi, Shinya Takamaeda-Yamazaki, Kentaro Yoshioka

Approximate computing emerges as a promising approach to enhance the efficiency of compute-in-memory (CiM) systems in deep neural network processing. However, traditional approximate techniques often significantly trade off accuracy for power efficiency, and fail to reduce data transfer between main memory and CiM banks, which dominates power consumption. This paper introduces a novel probabilistic approximate computation (PAC) method that leverages statistical techniques to approximate multiply-and-accumulation (MAC) operations, reducing approximation error by 4X compared to existing approaches. PAC enables efficient sparsity-based computation in CiM systems by simplifying complex MAC vector computations into scalar calculations. Moreover, PAC enables sparsity encoding and eliminates the LSB activations transmission, significantly reducing data reads and writes. This sets PAC apart from traditional approximate computing techniques, minimizing not only computation power but also memory accesses by 50%, thereby boosting system-level efficiency. We developed PACiM, a sparsity-centric architecture that fully exploits sparsity to reduce bit-serial cycles by 81% and achieves a peak 8b/8b efficiency of 14.63 TOPS/W in 65 nm CMOS while maintaining high accuracy of 93.85/72.36/66.02% on CIFAR-10/CIFAR-100/ImageNet benchmarks using a ResNet-18 model, demonstrating the effectiveness of our PAC methodology.

8/30/2024

STT-RAM-based Hierarchical In-Memory Computing

Dhruv Gajaria, Kevin Antony Gomez, Tosiron Adegbija

In-memory computing promises to overcome the von Neumann bottleneck in computer systems by performing computations directly within the memory. Previous research has suggested using Spin-Transfer Torque RAM (STT-RAM) for in-memory computing due to its non-volatility, low leakage power, high density, endurance, and commercial viability. This paper explores hierarchical in-memory computing, where different levels of the memory hierarchy are augmented with processing elements to optimize workload execution. The paper investigates processing in memory (PiM) using non-volatile STT-RAM and processing in cache (PiC) using volatile STT-RAM with relaxed retention, which helps mitigate STT-RAM's write latency and energy overheads. We analyze tradeoffs and overheads associated with data movement for PiC versus write overheads for PiM using STT-RAMs for various workloads. We examine workload characteristics, such as computational intensity and CPU-dependent workloads with limited instruction-level parallelism, and their impact on PiC/PiM tradeoffs. Using these workloads, we evaluate computing in STT-RAM versus SRAM at different cache hierarchy levels and explore the potential of heterogeneous STT-RAM cache architectures with various retention times for PiC and CPU-based computing. Our experiments reveal significant advantages of STT-RAM-based PiC over PiM for specific workloads. Finally, we describe open research problems in hierarchical in-memory computing architectures to further enhance this paradigm.

7/30/2024