Multicore DRAM Bank-& Row-Conflict Bomb for Timing Attacks in Mixed-Criticality Systems

arXiv:2404.01910

Published 4/3/2024 by Antonio Savino, Gautam Gala, Marcello Cinque, Gerhard Fohler

Abstract

With the increasing use of multicore platforms to realize mixed-criticality systems, understanding the underlying shared resources, such as the memory hierarchy shared among cores, and achieving isolation between co-executing tasks running on the same platform with different criticality levels become relevant. In addition to safety considerations, a malicious entity can exploit shared resources to create timing attacks on critical applications. In this paper, we focus on understanding the shared DRAM dual in-line memory module and create a timing attack, which we name the bank & row conflict bomb, to target a victim task in a multicore platform. We also create a navigate algorithm to understand how victim requests are managed by the Memory Controller and to provide valuable inputs for designing the bank & row conflict bomb. We performed experimental tests on a 2nd Gen Intel Xeon Processor with an 8GB DDR4-2666 DRAM module to show that such an attack can increase the execution time of the victim task by about 150%, motivating the need for proper countermeasures to help ensure the safety and security of critical applications.

Overview

This paper explores a timing attack that exploits conflicts in DRAM bank and row accesses on multicore systems to disrupt the timing of mixed-criticality systems.
The attack works by intentionally causing DRAM bank and row conflicts between a malicious process and a victim process, leading to significant performance degradation for the victim.
The paper demonstrates the feasibility of this attack and analyzes its potential impact on mixed-criticality systems, which rely on strict timing guarantees.

Plain English Explanation

The research paper describes a new type of timing attack that can disrupt mixed-criticality computer systems. These are systems where critical tasks, like controlling an aircraft, must finish within strict and predictable deadlines, while less important tasks running on the same hardware can tolerate delays.

The attack works by exploiting how the cores of a multicore chip share DRAM memory. Each DRAM bank can keep only one row open at a time, so when requests from different cores target different rows of the same bank, the bank has to close one row and open another before it can serve the next request. This is a row conflict, and it adds delay to every access involved. The researchers show how a malicious process can intentionally create these conflicts with a victim process, causing the victim's performance to suffer dramatically.
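To make the bank and row distinction concrete, here is a minimal Python sketch of how a physical address might be split into a bank index and a row index, and of the condition under which two back-to-back accesses conflict. The bit layout (`COLUMN_BITS`, `BANK_BITS`) and the helper names `decode` and `is_row_conflict` are assumptions for illustration. Real memory controllers use mappings that are often undocumented, and nothing here is taken from the paper.

```python
from typing import NamedTuple

# Hypothetical address layout, for illustration only.
COLUMN_BITS = 13  # assumed: low 13 bits select a byte within an open row
BANK_BITS = 4     # assumed: next 4 bits select one of 16 banks


class DramLocation(NamedTuple):
    bank: int
    row: int


def decode(addr: int) -> DramLocation:
    """Split a physical address into (bank, row) under the assumed layout."""
    bank = (addr >> COLUMN_BITS) & ((1 << BANK_BITS) - 1)
    row = addr >> (COLUMN_BITS + BANK_BITS)
    return DramLocation(bank, row)


def is_row_conflict(first: int, second: int) -> bool:
    """True when two back-to-back accesses map to the same bank but different rows."""
    a, b = decode(first), decode(second)
    return a.bank == b.bank and a.row != b.row


if __name__ == "__main__":
    print(is_row_conflict(0x0, 1 << 17))  # same bank, different row
    print(is_row_conflict(0x0, 1 << 13))  # different bank
    print(is_row_conflict(0x0, 0x40))     # same bank, same row (a row hit)
```

Under this assumed layout, only the first pair conflicts: the second pair lands in different banks, and the third pair stays within one open row.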

This is a serious threat to mixed-criticality systems, where the timing guarantees of the critical tasks are essential for safety and reliability. By breaking these guarantees, the attack could compromise the entire system.

Technical Explanation

The paper first provides background on multicore DRAM architecture and timing side-channels. It then describes the key aspects of the proposed attack:

Exploit DRAM Bank and Row Conflicts: The attack leverages the fact that accesses from different cores to different rows of the same DRAM bank force the bank to repeatedly close and reopen rows, which significantly slows every access involved.
Synchronize Malicious and Victim Accesses: The malicious process synchronizes its DRAM accesses to intentionally overlap with the victim's accesses, leading to frequent bank/row conflicts.
Amplify the Attack Impact: The malicious process further amplifies the attack impact by triggering many conflicting DRAM accesses in rapid succession, creating a "DRAM bank and row conflict bomb."
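The effect of these three steps can be illustrated with a toy simulation of a single DRAM bank under an open-row policy. This is a simplified sketch, not the authors' attack or their experimental setup: the latency constants are arbitrary cycle counts chosen for illustration, the names `Bank` and `victim_latency` are hypothetical, and the model charges the victim only for its own accesses, ignoring queueing delay in the memory controller.

```python
from typing import Optional, Sequence

# Assumed latencies in arbitrary cycles; these are not measured values.
ROW_HIT_CYCLES = 15       # requested row is already open
ROW_MISS_CYCLES = 30      # bank is idle: activate, then column access
ROW_CONFLICT_CYCLES = 45  # another row is open: precharge, activate, column access


class Bank:
    """A single DRAM bank that keeps the most recently used row open."""

    def __init__(self) -> None:
        self.open_row: Optional[int] = None

    def access(self, row: int) -> int:
        """Serve one access and return its latency in cycles."""
        if self.open_row == row:
            cost = ROW_HIT_CYCLES
        elif self.open_row is None:
            cost = ROW_MISS_CYCLES
        else:
            cost = ROW_CONFLICT_CYCLES
        self.open_row = row
        return cost


def victim_latency(victim_rows: Sequence[int],
                   attacker_rows: Optional[Sequence[int]] = None) -> int:
    """Total cycles spent serving the victim's accesses to one bank.

    If attacker_rows is given, one attacker access is interleaved after
    each victim access; its latency is not charged to the victim.
    """
    bank = Bank()
    total = 0
    for i, row in enumerate(victim_rows):
        total += bank.access(row)
        if attacker_rows:
            bank.access(attacker_rows[i % len(attacker_rows)])
    return total


if __name__ == "__main__":
    victim = [7] * 8                    # victim streams through one row
    print(victim_latency(victim))       # 135: one miss, then seven hits
    print(victim_latency(victim, [9]))  # 345: one miss, then seven conflicts
```

In this model the attacker never needs more bandwidth than the victim. Touching a different row of the same bank after each victim access is enough to turn every would-be row hit into a row conflict.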

The paper demonstrates the feasibility of this attack through experiments on a real multicore system, a 2nd Gen Intel Xeon processor with an 8GB DDR4-2666 DRAM module. It shows that the attack can increase the victim task's execution time by about 150%, which is enough to violate the timing guarantees required by mixed-criticality systems.
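The abstract reports that the victim's execution time grows by about 150%. As a quick check of what that figure implies, the short sketch below converts a percentage increase into a slowdown factor. The function names and the 10 ms baseline are hypothetical and serve only as a worked example.

```python
def slowdown_factor(increase_percent: float) -> float:
    """Convert a percentage increase in execution time into a multiplier."""
    return 1.0 + increase_percent / 100.0


def attacked_time(baseline: float, increase_percent: float) -> float:
    """Execution time under attack, in the same unit as the baseline."""
    return baseline * slowdown_factor(increase_percent)


if __name__ == "__main__":
    print(slowdown_factor(150))      # 2.5
    print(attacked_time(10.0, 150))  # 25.0, for a hypothetical 10 ms task
```

An increase of about 150% therefore means the victim takes roughly 2.5 times as long as it does without interference, so a task with less than that much slack before its deadline would miss it.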

Critical Analysis

The paper provides a thorough analysis of the attack and its potential impact on mixed-criticality systems. However, it does not explore potential mitigations or defenses against this type of attack. Future research could investigate hardware or software-based techniques to detect and prevent such DRAM-based timing attacks.

Additionally, the paper focuses on a specific attack scenario involving a malicious process and a victim process. Further research could examine the attack's applicability to more complex mixed-criticality system architectures and workloads.

Conclusion

This research paper presents a novel timing attack that exploits DRAM bank and row conflicts on multicore systems to disrupt the timing guarantees of mixed-criticality systems. The attack demonstrates the potential for severe performance degradation and the need for robust security measures in these critical systems. The findings highlight the importance of continued research into hardware and software defenses against timing-based attacks to ensure the safety and reliability of mixed-criticality systems.
