Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis

Read original: arXiv:2405.06081 - Published 5/13/2024 by Ismail Emir Yuksel, Yahya Can Tugrul, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Ataberk Olgun, Melina Soysal, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Onur Mutlu

Overview

  • The researchers experimentally analyzed the computational capabilities of commercial off-the-shelf (COTS) DRAM chips and how robust these capabilities are under various conditions.
  • They extensively characterized 120 COTS DDR4 chips from two major manufacturers.
  • The key findings are that the chips can simultaneously activate up to 32 rows, perform majority (MAJX) operations with more than three inputs, and copy one row to up to 31 other rows at once (Multi-RowCopy).
  • The success rate of these operations can be improved by replicating input data across multiple rows, and the operations are resilient to changes in temperature and voltage.
  • The researchers believe these results demonstrate the potential of using DRAM as a computation substrate and have open-sourced their infrastructure to aid future research.

Plain English Explanation

The researchers looked at how DRAM chips, the memory components found in computers and other devices, can be used for more than just storing data. They found that DRAM chips can actually perform certain computational tasks, like simultaneously activating multiple rows of memory or copying data from one row to many others.

For example, the DRAM chips can perform a type of operation called a "majority" vote, where they look at an odd number of input bits (three, five, seven, or nine in this study) and output the value that appears most often. This could be useful for things like error-checking or decision-making in computing systems.

The researchers found that the DRAM chips' ability to perform these computational tasks is quite robust: it doesn't get thrown off much by changes in temperature or voltage. The specific pattern of data being stored matters more for majority operations than for copying, where it has almost no effect. They also found that replicating the input data across multiple rows can substantially improve the success rate of majority operations.

Overall, the researchers believe their findings show the potential of using DRAM chips, which are already widely available and affordable, as a way to do basic computation alongside traditional memory functions. This could lead to more efficient and versatile computing systems in the future.

Technical Explanation

The researchers conducted an extensive characterization of 120 COTS DDR4 DRAM chips from two major manufacturers. They focused on evaluating the computational capabilities of these DRAM chips and how robust these capabilities are under various conditions, including timing delays between DRAM commands, data patterns, temperature, and voltage levels.

Their key findings include:

  1. Simultaneous Many-Row Activation: The DRAM chips are capable of simultaneously activating up to 32 rows. This ability, known as simultaneous many-row activation, enables parallel processing of data across multiple DRAM rows.

  2. Majority (MAJ) Operations: The DRAM chips can execute a majority-of-X (MAJX) operation for X greater than 3 (MAJ5, MAJ7, and MAJ9). These majority operations can be useful for error-checking and decision-making.

  3. Multi-RowCopy: The DRAM chips can copy a DRAM row concurrently to up to 31 other DRAM rows, a capability the researchers call Multi-RowCopy.
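The three capabilities above can be modeled functionally in a few lines. The following Python sketch is only a logical model under stated assumptions: each DRAM row is represented as a list of bits, and the function names `majx` and `multi_row_copy` are hypothetical, not part of the authors' infrastructure. It says nothing about the DRAM command timings that trigger these operations in real chips.

```python
from typing import List

Row = List[int]  # assumption: one DRAM row modeled as a list of bits


def majx(rows: List[Row]) -> Row:
    """Bitwise majority across an odd number of equally sized rows."""
    if len(rows) % 2 == 0:
        raise ValueError("majority needs an odd number of input rows")
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        raise ValueError("all rows must have the same width")
    threshold = len(rows) // 2  # strictly more than half must be 1
    return [1 if sum(col) > threshold else 0 for col in zip(*rows)]


def multi_row_copy(source: Row, n_destinations: int) -> List[Row]:
    """Logical model of copying one source row to several rows at once."""
    if not 1 <= n_destinations <= 31:  # the study reports up to 31 destinations
        raise ValueError("n_destinations must be between 1 and 31")
    return [list(source) for _ in range(n_destinations)]


a = [1, 0, 1, 0]
b = [1, 1, 0, 0]
c = [0, 1, 1, 0]
maj3 = majx([a, b, c])
maj5 = majx([a, b, c, [1, 1, 1, 1], [0, 0, 0, 0]])
copies = multi_row_copy(a, 31)
print(maj3)         # [1, 1, 1, 0]
print(maj5)         # [1, 1, 1, 0]
print(len(copies))  # 31
```

In this model every bit position (column) is processed independently, which is why a single operation acts on a whole row of data at once.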

The researchers also found that storing multiple copies of the input operands across all simultaneously activated rows significantly improves the success rate of MAJX operations, where success rate is the percentage of DRAM cells that correctly perform the computation. For example, MAJ3 with 32-row activation (i.e., replicating each MAJ3 input operand 10 times) has a 30.81% higher average success rate than MAJ3 with 4-row activation (no replication).
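To make the replication idea and the success-rate metric concrete, here is a minimal Python sketch. It assumes a purely logical model: each of the three MAJ3 operands is stored 10 times (30 rows), and the result is the bitwise majority over those rows. How the two remaining rows of a 32-row activation are initialized is not stated in this summary, so they are left out. The helper names and the sample `observed` row are hypothetical; `success_rate` follows the paper's definition, the percentage of cells that produce the correct result.

```python
from typing import List

Row = List[int]  # assumption: one DRAM row modeled as a list of bits


def majority(rows: List[Row]) -> Row:
    """Bitwise majority over rows; raises if any column is tied."""
    n = len(rows)
    out = []
    for col in zip(*rows):
        ones = sum(col)
        if 2 * ones == n:
            raise ValueError("tie: majority is undefined for this column")
        out.append(1 if 2 * ones > n else 0)
    return out


def replicate(operands: List[Row], copies: int) -> List[Row]:
    """Store each operand row `copies` times."""
    return [list(op) for op in operands for _ in range(copies)]


def success_rate(observed: Row, expected: Row) -> float:
    """Percentage of cells whose observed value matches the expected one."""
    correct = sum(o == e for o, e in zip(observed, expected))
    return 100.0 * correct / len(expected)


operands = [
    [1, 0, 1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1],
]
expected = majority(operands)                    # plain MAJ3, no replication
replicated_rows = replicate(operands, 10)        # 30 rows in total
replicated = majority(replicated_rows)           # same logical result

# Hypothetical readout in which the last cell failed (not measured data).
observed = [1, 1, 1, 0, 1, 0, 0, 1]
print(replicated == expected)                    # True
print(success_rate(observed, expected))          # 87.5
```

Logically, replication changes nothing: the majority over the replicated rows equals plain MAJ3. The benefit reported in the study shows up only in the success rate measured on real chips.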

Furthermore, the researchers observed that data patterns affect the success rate of MAJX and Multi-RowCopy operations by 11.52% and 0.07% on average, respectively.

Importantly, the researchers found that simultaneous many-row activation, MAJX, and Multi-RowCopy operations are highly resilient to temperature and voltage changes, with success rate variations of at most 2.13% across all tested operations.

Critical Analysis

The researchers provide a comprehensive and rigorous analysis of the computational capabilities of COTS DRAM chips. Their findings demonstrate the potential of using DRAM as a computation substrate, beyond its traditional role as a memory component.

One potential limitation of the study is the specific set of DRAM chips tested (120 DDR4 chips from two manufacturers). While this sample size is substantial, it may not fully capture the diversity of DRAM chips available on the market. Expanding the analysis to include DRAM chips from additional manufacturers and different generations could provide further insights.

Additionally, the researchers focused on evaluating the success rates of specific computational operations, such as majority voting and multi-row copying. While these operations are interesting and potentially useful, there may be other computational tasks or workloads that DRAM chips could be well-suited for, which were not explored in this study.

Future research could investigate the integration of DRAM-based computation with traditional computing architectures, exploring the potential synergies and trade-offs between memory and computation. Additionally, simulation frameworks could be developed to further explore the design space and potential applications of DRAM-based computation.

Conclusion

The researchers' findings demonstrate the impressive computational capabilities of COTS DRAM chips, which go beyond their traditional role as memory components. The ability to perform simultaneous many-row activation, majority operations, and multi-row copying, with high resilience to environmental changes, suggests that DRAM could be leveraged as a computation substrate.

These results open up new possibilities for more efficient and versatile computing systems, where DRAM chips can contribute to both data storage and basic computation. By open-sourcing their infrastructure, the researchers have made it easier for others to build upon this work and further explore the potential of DRAM-based computational models.



Related Papers

Simultaneous Many-Row Activation in Off-the-Shelf DRAM Chips: Experimental Characterization and Analysis

Ismail Emir Yuksel, Yahya Can Tugrul, F. Nisa Bostanci, Geraldo F. Oliveira, A. Giray Yaglikci, Ataberk Olgun, Melina Soysal, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Onur Mutlu

We experimentally analyze the computational capability of commercial off-the-shelf (COTS) DRAM chips and the robustness of these capabilities under various timing delays between DRAM commands, data patterns, temperature, and voltage levels. We extensively characterize 120 COTS DDR4 chips from two major manufacturers. We highlight four key results of our study. First, COTS DRAM chips are capable of 1) simultaneously activating up to 32 rows (i.e., simultaneous many-row activation), 2) executing a majority of X (MAJX) operation where X>3 (i.e., MAJ5, MAJ7, and MAJ9 operations), and 3) copying a DRAM row (concurrently) to up to 31 other DRAM rows, which we call Multi-RowCopy. Second, storing multiple copies of MAJX's input operands on all simultaneously activated rows drastically increases the success rate (i.e., the percentage of DRAM cells that correctly perform the computation) of the MAJX operation. For example, MAJ3 with 32-row activation (i.e., replicating each MAJ3's input operands 10 times) has a 30.81% higher average success rate than MAJ3 with 4-row activation (i.e., no replication). Third, data pattern affects the success rate of MAJX and Multi-RowCopy operations by 11.52% and 0.07% on average. Fourth, simultaneous many-row activation, MAJX, and Multi-RowCopy operations are highly resilient to temperature and voltage changes, with small success rate variations of at most 2.13% among all tested operations. We believe these empirical results demonstrate the promising potential of using DRAM as a computation substrate. To aid future research and development, we open-source our infrastructure at https://github.com/CMU-SAFARI/SiMRA-DRAM.

5/13/2024

Functionally-Complete Boolean Logic in Real DRAM Chips: Experimental Characterization and Analysis

Ismail Emir Yuksel, Yahya Can Tugrul, Ataberk Olgun, F. Nisa Bostanci, A. Giray Yaglikci, Geraldo F. Oliveira, Haocong Luo, Juan Gómez-Luna, Mohammad Sadrosadati, Onur Mutlu

Processing-using-DRAM (PuD) is an emerging paradigm that leverages the analog operational properties of DRAM circuitry to enable massively parallel in-DRAM computation. PuD has the potential to reduce or eliminate costly data movement between processing elements and main memory. Prior works experimentally demonstrate three-input MAJ (MAJ3) and two-input AND and OR operations in commercial off-the-shelf (COTS) DRAM chips. Yet, demonstrations on COTS DRAM chips do not provide a functionally complete set of operations. We experimentally demonstrate that COTS DRAM chips are capable of performing 1) functionally-complete Boolean operations: NOT, NAND, and NOR and 2) many-input (i.e., more than two-input) AND and OR operations. We present an extensive characterization of new bulk bitwise operations in 256 off-the-shelf modern DDR4 DRAM chips. We evaluate the reliability of these operations using a metric called success rate: the fraction of correctly performed bitwise operations. Among our 19 new observations, we highlight four major results. First, we can perform the NOT operation on COTS DRAM chips with a 98.37% success rate on average. Second, we can perform up to 16-input NAND, NOR, AND, and OR operations on COTS DRAM chips with high reliability (e.g., 16-input NAND, NOR, AND, and OR with an average success rate of 94.94%, 95.87%, 94.94%, and 95.85%, respectively). Third, data pattern only slightly affects bitwise operations. Our results show that executing NAND, NOR, AND, and OR operations with random data patterns decreases the success rate compared to all logic-1/logic-0 patterns by 1.39%, 1.97%, 1.43%, and 1.98%, respectively. Fourth, bitwise operations are highly resilient to temperature changes, with small success rate fluctuations of at most 1.66% when the temperature is increased from 50°C to 95°C. We open-source our infrastructure at https://github.com/CMU-SAFARI/FCDRAM.

4/23/2024

A 65nm 8b-Activation 8b-Weight SRAM-Based Charge-Domain Computing-in-Memory Macro Using A Fully-Parallel Analog Adder Network and A Single-ADC Interface

Guodong Yin, Mufeng Zhou, Yiming Chen, Wenjun Tang, Zekun Yang, Mingyen Lee, Xirui Du, Jinshan Yue, Jiaxin Liu, Huazhong Yang, Yongpan Liu, Xueqing Li

Performing data-intensive tasks in the von Neumann architecture is challenging to achieve both high performance and power efficiency due to the memory wall bottleneck. Computing-in-memory (CiM) is a promising mitigation approach by enabling parallel in-situ multiply-accumulate (MAC) operations within the memory with support from the peripheral interface and datapath. SRAM-based charge-domain CiM (CD-CiM) has shown its potential of enhanced power efficiency and computing accuracy. However, existing SRAM-based CD-CiM faces scaling challenges to meet the throughput requirement of high-performance multi-bit-quantization applications. This paper presents an SRAM-based high-throughput ReLU-optimized CD-CiM macro. It is capable of completing MAC and ReLU of two signed 8b vectors in one CiM cycle with only one A/D conversion. Along with non-linearity compensation for the analog computing and A/D conversion interfaces, this work achieves 51.2GOPS throughput and 10.3TOPS/W energy efficiency, while showing 88.6% accuracy in the CIFAR-10 dataset.

4/3/2024

Multicore DRAM Bank-& Row-Conflict Bomb for Timing Attacks in Mixed-Criticality Systems

Antonio Savino, Gautam Gala, Marcello Cinque, Gerhard Fohler

With the increasing use of multicore platforms to realize mixed-criticality systems, understanding the underlying shared resources, such as the memory hierarchy shared among cores, and achieving isolation between co-executing tasks running on the same platform with different criticality levels becomes relevant. In addition to safety considerations, a malicious entity can exploit shared resources to create timing attacks on critical applications. In this paper, we focus on understanding the shared DRAM dual in-line memory module and created a timing attack, that we named the bank & row conflict bomb, to target a victim task in a multicore platform. We also created a navigate algorithm to understand how victim requests are managed by the Memory Controller and provide valuable inputs for designing the bank & row conflict bomb. We performed experimental tests on a 2nd Gen Intel Xeon Processor with an 8GB DDR4-2666 DRAM module to show that such an attack can produce a significant increase in the execution time of the victim task by about 150%, motivating the need for proper countermeasures to help ensure the safety and security of critical applications.

4/3/2024