On Error Correction for Nonvolatile Processing-In-Memory

2207.13261

Published 4/30/2024 by Husrev Cılasun, Salonik Resch, Zamshed I. Chowdhury, Masoud Zabihi, Yang Lv, Brandon Zink, Jian-Ping Wang, Sachin S. Sapatnekar, Ulya R. Karpuzcu

cs.ET

Abstract

Processing in memory (PiM) represents a promising computing paradigm to enhance performance of numerous data-intensive applications. Variants performing computing directly in emerging nonvolatile memories can deliver very high energy efficiency. PiM architectures directly inherit the vulnerabilities of the underlying memory substrates, but they are also subject to errors due to the computation in place. Numerous well-established error correcting codes (ECC) for memory exist and are also considered in the PiM context; however, they typically ignore errors that occur throughout computation. In this paper we revisit the error correction design space for nonvolatile PiM, considering both storage/memory and computation-induced errors, surveying several self-checking and homomorphic approaches. We propose several solutions and analyze their complex performance-area-coverage trade-off, using three representative nonvolatile PiM technologies. All of these solutions guarantee single error correction for both bulk bitwise computations and ordinary memory/storage errors.

Overview

Processing in Memory (PiM) is a promising computing paradigm that can enhance the performance of data-intensive applications.
PiM architectures can deliver very high energy efficiency by performing computing directly in emerging nonvolatile memories.
However, PiM architectures inherit the vulnerabilities of the underlying memory substrates and are also subject to errors due to the computation in place.
Error correcting codes (ECC) for memory are well-established, but they typically ignore errors that occur throughout computation.

Plain English Explanation

In traditional computer systems, data is stored in memory and then moved to a separate processing unit for computation. Processing in Memory (PiM) is a new approach that aims to improve efficiency by performing computations directly within the memory itself. This can significantly reduce the time and energy required to move data back and forth.

PiM architectures use emerging nonvolatile memory technologies, such as phase-change memory or memristors, to achieve very high energy efficiency. However, these memory technologies can also be more prone to errors, both in the storage of data and during the computation process. Existing error correcting codes (ECC) designed for memory systems may not be sufficient to handle the additional errors introduced by in-place computation.

This paper explores new approaches to error correction for PiM systems, considering both the errors that can occur in the memory storage and the errors that can occur during the computation itself. The researchers propose several solutions and analyze the tradeoffs in terms of performance, area, and the level of error coverage they provide.

Technical Explanation

The paper revisits the error correction design space for nonvolatile PiM, considering both storage/memory and computation-induced errors. The researchers survey several self-checking and homomorphic approaches to address these challenges.
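To make the idea of a homomorphic code concrete, here is a minimal Python sketch. It assumes Hamming(7,4) as a stand-in code, since this summary does not say which codes the paper uses, and all function names are hypothetical. Because Hamming codes are linear, the bitwise XOR of two codewords is itself a valid codeword, so a single bit flip in the computed result can still be located and corrected. Bitwise AND does not preserve this property, which hints at why in-place computation needs more than ordinary memory ECC.

```python
# Illustrative sketch only: Hamming(7,4) is an assumed stand-in linear code,
# not necessarily one of the codes evaluated in the paper.

def encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    # Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]

def syndrome(c):
    """Return 0 for a valid codeword, else the 1-based position of a single flipped bit."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    return s1 + 2 * s2 + 4 * s3

def correct(c):
    """Return a copy of c with a single-bit error (if any) flipped back."""
    c = list(c)
    pos = syndrome(c)
    if pos:
        c[pos - 1] ^= 1
    return c

def decode(c):
    """Extract the 4 data bits from a (corrected) codeword."""
    return [c[2], c[4], c[5], c[6]]

def bitwise_xor(a, b):
    """Bulk bitwise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]

if __name__ == "__main__":
    a, b = [1, 0, 1, 1], [0, 1, 1, 0]
    result = bitwise_xor(encode(a), encode(b))  # compute on encoded operands
    result[4] ^= 1                              # inject one computation error
    print(decode(correct(result)))              # recovers a XOR b
    and_result = [x & y for x, y in zip(encode(a), encode(b))]
    print(syndrome(and_result))                 # nonzero: AND breaks the code
```

In this sketch the XOR result can be checked and corrected without decoding the operands first, whereas the AND of two codewords is flagged as erroneous even though no fault occurred.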

The proposed solutions guarantee single error correction for both bulk bitwise computations and ordinary memory/storage errors. The researchers analyze the complex performance-area-coverage trade-off of these solutions using three representative nonvolatile PiM technologies.
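As a rough illustration of how a single computation error can be masked for any bulk bitwise operation, the sketch below uses triple redundancy with majority voting. This is a generic textbook technique chosen for illustration, not necessarily one of the paper's proposed solutions; `redundant_bitwise` and its `fault` parameter are hypothetical names introduced here.

```python
# Illustrative sketch only: triple redundancy with majority voting is an
# assumed example technique, not a scheme confirmed by the source.

def majority(a, b, c):
    """Bitwise majority vote over three single-bit values."""
    return (a & b) | (a & c) | (b & c)

def redundant_bitwise(op, x, y, fault=None):
    """Apply op bit by bit three times and vote on each output bit.

    fault, if given, is a (copy_index, bit_index) pair that flips one bit
    in one of the three result copies to emulate a computation error.
    """
    copies = [[op(p, q) for p, q in zip(x, y)] for _ in range(3)]
    if fault is not None:
        copy_index, bit_index = fault
        copies[copy_index][bit_index] ^= 1
    return [majority(a, b, c) for a, b, c in zip(*copies)]

if __name__ == "__main__":
    x, y = [1, 0, 1, 1], [0, 1, 1, 0]
    bit_and = lambda p, q: p & q
    # The injected fault in copy 1, bit 2 is outvoted by the other two copies.
    print(redundant_bitwise(bit_and, x, y, fault=(1, 2)))
```

The voted result matches the fault-free result for any single flipped bit, but the computation and storage are roughly tripled, which is one simple instance of the kind of performance-area-coverage trade-off the paper analyzes.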

The paper explores techniques for building robust in-memory computing systems that can withstand the various types of errors that can occur, including those introduced by the computation process itself. This is an important consideration as PiM architectures become more prevalent in data-intensive applications.

Critical Analysis

The paper addresses an important challenge in the development of PiM architectures, namely, the need to ensure reliable computation in the face of various error sources. The proposed solutions demonstrate the researchers' efforts to tackle this problem comprehensively, considering both memory storage and computation-induced errors.

However, the analysis rests on three representative technologies, and further research may be needed to assess the feasibility and scalability of these solutions in actual PiM systems. Although the paper analyzes the performance-area-coverage trade-off of its solutions, how these overheads affect end-to-end system performance and energy efficiency remains a critical factor in the adoption of PiM architectures.

Nonetheless, the paper provides a valuable contribution to the field by highlighting the importance of robust error correction in PiM and proposing several promising approaches to address this challenge. As PiM continues to evolve, addressing the reliability and resilience of these systems will be crucial for their widespread adoption in data-intensive applications.

Conclusion

This paper explores the critical challenge of ensuring reliable computation in Processing in Memory (PiM) architectures. The researchers propose several solutions that incorporate self-checking and homomorphic techniques to address both storage/memory and computation-induced errors. These solutions aim to provide comprehensive error correction for PiM systems, which is essential as these architectures become more prevalent in data-intensive applications.

While further research is needed to fully evaluate the practicality and performance of these solutions, this paper represents an important step forward in the development of robust and reliable PiM systems. As the field of in-memory computing continues to advance, addressing the reliability and resilience of these systems will be a key priority for researchers and engineers working to unlock the full potential of PiM.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Analysis of Distributed Optimization Algorithms on a Real Processing-In-Memory System

Steve Rhyner, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Jiawei Jiang, Ataberk Olgun, Harshita Gupta, Ce Zhang, Onur Mutlu

Machine Learning (ML) training on large-scale datasets is a very expensive and time-consuming workload. Processor-centric architectures (e.g., CPU, GPU) commonly used for modern ML training workloads are limited by the data movement bottleneck, i.e., due to repeatedly accessing the training dataset. As a result, processor-centric systems suffer from performance degradation and high energy consumption. Processing-In-Memory (PIM) is a promising solution to alleviate the data movement bottleneck by placing the computation mechanisms inside or near memory. Our goal is to understand the capabilities and characteristics of popular distributed optimization algorithms on real-world PIM architectures to accelerate data-intensive ML training workloads. To this end, we 1) implement several representative centralized distributed optimization algorithms on UPMEM's real-world general-purpose PIM system, 2) rigorously evaluate these algorithms for ML training on large-scale datasets in terms of performance, accuracy, and scalability, 3) compare to conventional CPU and GPU baselines, and 4) discuss implications for future PIM hardware and the need to shift to an algorithm-hardware codesign perspective to accommodate decentralized distributed optimization algorithms. Our results demonstrate three major findings: 1) Modern general-purpose PIM architectures can be a viable alternative to state-of-the-art CPUs and GPUs for many memory-bound ML training workloads, when operations and datatypes are natively supported by PIM hardware, 2) the importance of carefully choosing the optimization algorithm that best fits PIM, and 3) contrary to popular belief, contemporary PIM architectures do not scale approximately linearly with the number of nodes for many data-intensive ML training workloads. To facilitate future research, we aim to open-source our complete codebase.

4/11/2024

cs.AR cs.AI cs.DC cs.LG

Intrinsic Voltage Offsets in Memcapacitive Bio-Membranes Enable High-Performance Physical Reservoir Computing

Ahmed S. Mohamed, Anurag Dhungel, Md Sakib Hasan, Joseph S. Najem

Reservoir computing is a brain-inspired machine learning framework for processing temporal data by mapping inputs into high-dimensional spaces. Physical reservoir computers (PRCs) leverage native fading memory and nonlinearity in physical substrates, including atomic switches, photonics, volatile memristors, and, recently, memcapacitors, to achieve efficient high-dimensional mapping. Traditional PRCs often consist of homogeneous device arrays, which rely on input encoding methods and large stochastic device-to-device variations for increased nonlinearity and high-dimensional mapping. These approaches incur high pre-processing costs and restrict real-time deployment. Here, we introduce a novel heterogeneous memcapacitor-based PRC that exploits internal voltage offsets to enable both monotonic and non-monotonic input-state correlations crucial for efficient high-dimensional transformations. We demonstrate our approach's efficacy by predicting a second-order nonlinear dynamical system with an extremely low prediction error (0.00018). Additionally, we predict a chaotic Hénon map, achieving a low normalized root mean square error (0.080). Unlike previous PRCs, such errors are achieved without input encoding methods, underscoring the power of distinct input-state correlations. Most importantly, we generalize our approach to other neuromorphic devices that lack inherent voltage offsets using externally applied offsets to realize various input-state correlations. Our approach and unprecedented performance are a major milestone towards high-performance full in-materia PRCs.

5/16/2024

cs.ET cs.AI cs.LG

Building time-surfaces by exploiting the complex volatility of an ECRAM memristor

Marco Rasetto, Qingzhou Wan, Himanshu Akolkar, Feng Xiong, Bertram Shi, Ryad Benosman

Memristors have emerged as a promising technology for efficient neuromorphic architectures owing to their ability to act as programmable synapses, combining processing and memory into a single device. Although they are most commonly used for static encoding of synaptic weights, recent work has begun to investigate the use of their dynamical properties, such as Short Term Plasticity (STP), to integrate events over time in event-based architectures. However, we are still far from completely understanding the range of possible behaviors and how they might be exploited in neuromorphic computation. This work focuses on a newly developed Li$_x$WO$_3$-based three-terminal memristor that exhibits tunable STP and a conductance response modeled by a double exponential decay. We derive a stochastic model of the device from experimental data and investigate how device stochasticity, STP, and the double exponential decay affect accuracy in a hierarchy of time-surfaces (HOTS) architecture. We found that the device's stochasticity does not affect accuracy, that STP can reduce the effect of salt and pepper noise in signals from event-based sensors, and that the double exponential decay improves accuracy by integrating temporal information over multiple time scales. Our approach can be generalized to study other memristive devices to build a better understanding of how control over temporal dynamics can enable neuromorphic engineers to fine-tune devices and architectures to fit their problems at hand.

4/16/2024

cs.ET

Investigating impact of bit-flip errors in control electronics on quantum computation

Subrata Das, Avimita Chatterjee, Swaroop Ghosh

In this paper, we investigate the impact of bit flip errors in FPGA memories in control electronics on quantum computing systems. FPGA memories are integral in storing the amplitude and phase information of pulse envelopes, which are essential for generating quantum gate pulses. However, these memories can incur faults due to physical and environmental stressors such as electromagnetic interference, power fluctuations, and temperature variations, as well as adversarial fault injections, potentially leading to errors in quantum gate operations. To understand how these faults affect quantum computations, we conducted a series of experiments to introduce bit flips into the amplitude (both real and imaginary components) and phase values of quantum pulses using IBM's simulated quantum environments, FakeValencia, FakeManila, and FakeLima. Our findings reveal that bit flips in the exponent and initial mantissa bits of the real amplitude cause substantial deviations in quantum gate operations, with TVD increases as high as ~200%. Interestingly, the remaining bits exhibited natural tolerance to errors. We proposed a 3-bit repetition error correction code, which effectively reduced the TVD increases to below 40% without incurring any memory overhead. Due to reuse of less significant bits for error correction, the proposed approach introduces a maximum of 5-7% extra TVD in nominal cases. However, this can be avoided by sacrificing memory area for implementing the repetition code.

5/10/2024

cs.ET