Experimental demonstration of magnetic tunnel junction-based computational random-access memory

arXiv:2312.14264

Published 4/8/2024 by Yang Lv, Brandon R. Zink, Robert P. Bloom, Husrev Cılasun, Pravin Khanal, Salonik Resch, Zamshed Chowdhury, Ali Habiboglu, Weigang Wang, Sachin S. Sapatnekar and 2 others

cs.ET cs.AI cs.AR cs.SY eess.SY

Abstract

The conventional computing paradigm struggles to fulfill the rapidly growing demands from emerging applications, especially those for machine intelligence, because much of the power and energy is consumed by constant data transfers between logic and memory modules. A new paradigm, called computational random-access memory (CRAM), has emerged to address this fundamental limitation. CRAM performs logic operations directly using the memory cells themselves, without having the data ever leave the memory. The energy and performance benefits of CRAM for both conventional and emerging applications have been well established by prior numerical studies. However, an experimental demonstration and study of CRAM that evaluates its computation accuracy, a realistic and application-critical metric for its technological feasibility and competitiveness, has been lacking. In this work, a CRAM array based on magnetic tunnel junctions (MTJs) is experimentally demonstrated. First, basic memory operations as well as 2-, 3-, and 5-input logic operations are studied. Then, a 1-bit full adder with two different designs is demonstrated. Based on the experimental results, a suite of models has been developed to characterize the accuracy of CRAM computation. Further analysis of scalar addition, multiplication, and matrix multiplication shows promising results. These results are then applied to a complete application, a neural-network-based handwritten digit classifier, as an example to show the connection between the application performance and further MTJ development. The classifier achieved almost-perfect classification accuracy, with reasonable projections of future MTJ development. With the confirmation of MTJ-based CRAM's accuracy, there is a strong case that this technology will have a significant impact on power- and energy-demanding applications of machine intelligence.

Overview

Conventional computing struggles to keep up with growing demands, especially for machine intelligence, due to high power consumption from data transfers between memory and logic.
A new paradigm called computational random-access memory (CRAM) performs logic operations directly in memory, avoiding data transfers.
Prior studies have shown CRAM's energy and performance benefits, but its computation accuracy, a critical metric, lacks experimental demonstration.

Plain English Explanation

The paper describes a new approach called computational random-access memory (CRAM) that addresses a fundamental limitation of conventional computing. In traditional computers, a lot of power and energy is consumed by constantly moving data between the memory and the logic parts of the system. This is a problem for emerging applications like machine intelligence, which have rapidly growing demands.

CRAM aims to solve this by performing logic operations directly within the memory cells themselves, without ever having to move the data out of the memory. Prior studies have shown that this can provide significant energy and performance benefits. However, one key question that hasn't been experimentally demonstrated is how accurate the computations performed by CRAM can be.

The researchers in this paper have built a CRAM system using magnetic tunnel junctions (MTJs) and tested it to evaluate its computational accuracy. They look at basic memory operations as well as more complex logic operations and arithmetic, like full adders. The results show that CRAM can achieve accurate computations, which is an important step in demonstrating its viability for real-world applications.
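To make the idea of logic inside memory cells concrete, here is a toy threshold model in Python. It is a sketch under loud assumptions, none of which come from this summary: input MTJs are wired in parallel and feed one preset output MTJ in series, logic 0 is the low-resistance state, and the output flips only if the current reaches a switching threshold. The names and values `R_P`, `R_AP`, `I_SWITCH`, and `v_applied` are made-up, unitless numbers chosen only so the example works.

```python
from itertools import product

# Hypothetical, unitless device values (assumptions, not measured data).
R_P = 1.0        # low-resistance state, taken here to encode logic 0
R_AP = 2.0       # high-resistance state, taken here to encode logic 1
I_SWITCH = 0.55  # current needed to flip the output cell


def cram_gate(bits, v_applied, preset=0):
    """Toy logic step: parallel input cells in series with a preset output cell.

    The output keeps its preset value unless the current through it
    reaches I_SWITCH, in which case it flips. The change in output
    resistance during switching is ignored for simplicity.
    """
    conductance = sum(1.0 / (R_AP if b else R_P) for b in bits)
    r_inputs = 1.0 / conductance
    r_output = R_AP if preset else R_P
    current = v_applied / (r_inputs + r_output)
    return 1 - preset if current >= I_SWITCH else preset


def truth_table(v_applied, n_inputs=2):
    """Outputs for all input combinations, in binary counting order."""
    return [cram_gate(bits, v_applied) for bits in product((0, 1), repeat=n_inputs)]


if __name__ == "__main__":
    print("v=1.0:", truth_table(1.0))  # behaves like NAND in this toy model
    print("v=0.9:", truth_table(0.9))  # behaves like NOR in this toy model
```

In this toy picture the same cells act as different gates depending only on the applied voltage. Real devices do not switch at a perfectly sharp threshold, which is one reason computation accuracy has to be measured rather than assumed.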

Technical Explanation

The researchers experimentally demonstrated a CRAM array based on magnetic tunnel junctions (MTJs). They first tested basic memory operations as well as 2-input, 3-input, and 5-input logic operations. They then implemented a 1-bit full adder in two different designs.
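This summary does not spell out the two adder designs. As a hedged illustration, the Python sketch below shows two textbook ways to build a 1-bit full adder from the gate sizes mentioned above: one from nine 2-input NAND gates, and one from a 3-input and a 5-input majority gate plus an inversion. Whether these match the designs used in the paper is an assumption; the function names are hypothetical.

```python
def nand(a, b):
    """2-input NAND on bits 0/1."""
    return 1 - (a & b)


def maj(*bits):
    """Majority vote over an odd number of bits."""
    return 1 if sum(bits) > len(bits) // 2 else 0


def full_adder_nand(a, b, cin):
    """Textbook full adder built from nine 2-input NAND gates."""
    n1 = nand(a, b)
    n2 = nand(a, n1)
    n3 = nand(b, n1)
    x = nand(n2, n3)        # a XOR b
    n4 = nand(x, cin)
    n5 = nand(x, n4)
    n6 = nand(cin, n4)
    s = nand(n5, n6)        # (a XOR b) XOR cin
    cout = nand(n1, n4)     # (a AND b) OR ((a XOR b) AND cin)
    return s, cout


def full_adder_maj(a, b, cin):
    """Textbook full adder built from MAJ3, an inversion, and MAJ5."""
    cout = maj(a, b, cin)
    not_cout = 1 - cout
    # The inverted carry is counted twice in the 5-input majority.
    s = maj(a, b, cin, not_cout, not_cout)
    return s, cout


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                print(a, b, cin, full_adder_nand(a, b, cin), full_adder_maj(a, b, cin))
```

The majority-based version needs far fewer logic steps than the NAND-based one, at the cost of gates with more inputs. That kind of trade-off is one reason an accuracy comparison between designs is informative.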

Using the experimental results, the team developed models to characterize the accuracy of CRAM computations. Further analysis looked at scalar addition, multiplication, and matrix multiplication, showing promising results. The researchers then applied these findings to a neural network based handwritten digit classifier, as an example of how CRAM could be used in a real-world machine intelligence application.
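The link between gate-level accuracy and arithmetic accuracy can be illustrated with a small Monte Carlo sketch. This is not the authors' model: it assumes every NAND gate flips its output independently with a hypothetical probability `p_err`, and it builds a ripple-carry adder from nine-NAND full adders. All names and the error rates used are illustrative.

```python
import random


def noisy_nand(a, b, p_err, rng):
    """NAND whose output is flipped with probability p_err (assumed error model)."""
    out = 1 - (a & b)
    return out ^ 1 if rng.random() < p_err else out


def noisy_full_adder(a, b, cin, p_err, rng):
    """Nine-NAND full adder in which every gate may err independently."""
    n1 = noisy_nand(a, b, p_err, rng)
    n2 = noisy_nand(a, n1, p_err, rng)
    n3 = noisy_nand(b, n1, p_err, rng)
    x = noisy_nand(n2, n3, p_err, rng)
    n4 = noisy_nand(x, cin, p_err, rng)
    n5 = noisy_nand(x, n4, p_err, rng)
    n6 = noisy_nand(cin, n4, p_err, rng)
    s = noisy_nand(n5, n6, p_err, rng)
    cout = noisy_nand(n1, n4, p_err, rng)
    return s, cout


def noisy_add(x, y, n_bits, p_err, rng):
    """Ripple-carry addition of two n_bits-wide unsigned integers."""
    carry, result = 0, 0
    for i in range(n_bits):
        s, carry = noisy_full_adder((x >> i) & 1, (y >> i) & 1, carry, p_err, rng)
        result |= s << i
    return result | (carry << n_bits)


def addition_error_rate(n_bits, p_err, trials=2000, seed=0):
    """Fraction of random additions whose result differs from x + y."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        x = rng.randrange(1 << n_bits)
        y = rng.randrange(1 << n_bits)
        if noisy_add(x, y, n_bits, p_err, rng) != x + y:
            wrong += 1
    return wrong / trials


if __name__ == "__main__":
    for p in (0.0, 0.001, 0.01, 0.05):
        print(p, addition_error_rate(n_bits=4, p_err=p))
```

Because a 4-bit addition in this sketch uses 36 gates, even small per-gate error rates add up quickly. Under these assumptions, that is why gate-level accuracy matters so much for larger operations such as matrix multiplication.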

The classifier achieved nearly perfect classification accuracy under reasonable projections of future MTJ development, rather than with the devices exactly as measured today. The example therefore ties application-level performance directly to continued progress in MTJ technology.

Critical Analysis

The paper provides a compelling experimental demonstration of the computational accuracy of CRAM, which is a crucial step in validating this technology. By testing basic logic operations as well as more complex arithmetic, the researchers have shown that CRAM can achieve the level of accuracy required for real-world applications.

However, the paper does not delve into the potential limitations or caveats of CRAM. For example, it would be interesting to understand the scaling and energy efficiency of CRAM compared to traditional von Neumann architectures, especially for large-scale computations. Prospects for non-linear memristors as alternative memory technologies could also be discussed.

Additionally, the paper focuses on a single application: a neural network for handwritten digit classification. It would be valuable to explore the potential of CRAM in a broader range of applications, such as memory retrieval-augmented neural networks or neuromorphic associative memory, to fully assess its capabilities and limitations.

Conclusion

This paper presents an important experimental demonstration of the computational accuracy of CRAM, a new paradigm that performs logic operations directly within memory. The results show that CRAM can achieve high levels of accuracy for a range of basic and more complex computations, which is a crucial step in establishing its feasibility and competitiveness for power-hungry machine intelligence applications.

With the confirmation of CRAM's accuracy, the researchers argue that this technology has the potential to have a significant impact on the field, especially as further improvements in MTJ technology reduce joule losses and enhance its capabilities. This work represents an important advancement in the quest for more energy-efficient computing architectures to meet the growing demands of emerging applications.

Related Papers

A 65nm 8b-Activation 8b-Weight SRAM-Based Charge-Domain Computing-in-Memory Macro Using A Fully-Parallel Analog Adder Network and A Single-ADC Interface

Guodong Yin, Mufeng Zhou, Yiming Chen, Wenjun Tang, Zekun Yang, Mingyen Lee, Xirui Du, Jinshan Yue, Jiaxin Liu, Huazhong Yang, Yongpan Liu, Xueqing Li

Performing data-intensive tasks in the von Neumann architecture is challenging to achieve both high performance and power efficiency due to the memory wall bottleneck. Computing-in-memory (CiM) is a promising mitigation approach by enabling parallel in-situ multiply-accumulate (MAC) operations within the memory with support from the peripheral interface and datapath. SRAM-based charge-domain CiM (CD-CiM) has shown its potential of enhanced power efficiency and computing accuracy. However, existing SRAM-based CD-CiM faces scaling challenges to meet the throughput requirement of high-performance multi-bit-quantization applications. This paper presents an SRAM-based high-throughput ReLU-optimized CD-CiM macro. It is capable of completing MAC and ReLU of two signed 8b vectors in one CiM cycle with only one A/D conversion. Along with non-linearity compensation for the analog computing and A/D conversion interfaces, this work achieves 51.2GOPS throughput and 10.3TOPS/W energy efficiency, while showing 88.6% accuracy in the CIFAR-10 dataset.

4/3/2024

cs.AR cs.LG

RAM: Towards an Ever-Improving Memory System by Learning from Communications

Jiaqi Li, Xiaobo Wang, Zihao Wang, Zilong Zheng

We introduce RAM, an innovative RAG-based framework with an ever-improving memory. Inspired by humans' pedagogical process, RAM utilizes recursively reasoning-based retrieval and experience reflections to continually update the memory and learn from users' communicative feedback, namely communicative learning. Extensive experiments with both simulated and real users demonstrate significant improvements over traditional RAG and self-knowledge methods, particularly excelling in handling false premise and multi-hop questions. Furthermore, RAM exhibits promising adaptability to various feedback and retrieval method chain types, showcasing its potential for advancing AI capabilities in dynamic knowledge acquisition and lifelong learning.

4/22/2024

cs.AI cs.CL

Spintronic memristors for computing

Qiming Shao, Zhongrui Wang, Yan Zhou, Shunsuke Fukami, Damien Querlioz, Yiran Chen, Leon O. Chua

The ever-increasing amount of data from ubiquitous smart devices fosters data-centric and cognitive algorithms. Traditional digital computer systems have separate logic and memory units, resulting in a huge delay and energy cost for implementing these algorithms. Memristors are programmable resistors with a memory, providing a paradigm-shifting approach towards creating intelligent hardware systems to handle data-centric tasks. Spintronic nanodevices are promising choices as they are high-speed, low-power, highly scalable, robust, and capable of constructing dynamic complex systems. In this Review, we survey spintronic devices from a memristor point of view. We introduce spintronic memristors based on magnetic tunnel junctions, nanomagnet ensemble, domain walls, topological spin textures, and spin waves, which represent dramatically different state spaces. They can exhibit steady, oscillatory, stochastic, and chaotic trajectories in their state spaces, which have been exploited for in-memory logic, neuromorphic computing, stochastic and chaos computing. Finally, we discuss challenges and trends in realizing large-scale spintronic memristive systems for practical applications.

4/23/2024

cs.ET

Building time-surfaces by exploiting the complex volatility of an ECRAM memristor

Marco Rasetto, Qingzhou Wan, Himanshu Akolkar, Feng Xiong, Bertram Shi, Ryad Benosman

Memristors have emerged as a promising technology for efficient neuromorphic architectures owing to their ability to act as programmable synapses, combining processing and memory into a single device. Although they are most commonly used for static encoding of synaptic weights, recent work has begun to investigate the use of their dynamical properties, such as Short Term Plasticity (STP), to integrate events over time in event-based architectures. However, we are still far from completely understanding the range of possible behaviors and how they might be exploited in neuromorphic computation. This work focuses on a newly developed Li$_x$WO$_3$-based three-terminal memristor that exhibits tunable STP and a conductance response modeled by a double exponential decay. We derive a stochastic model of the device from experimental data and investigate how device stochasticity, STP, and the double exponential decay affect accuracy in a hierarchy of time-surfaces (HOTS) architecture. We found that the device's stochasticity does not affect accuracy, that STP can reduce the effect of salt and pepper noise in signals from event-based sensors, and that the double exponential decay improves accuracy by integrating temporal information over multiple time scales. Our approach can be generalized to study other memristive devices to build a better understanding of how control over temporal dynamics can enable neuromorphic engineers to fine-tune devices and architectures to fit their problems at hand.

4/16/2024

cs.ET