Chiplet Placement Order Exploration Based on Learning to Rank with Graph Representation

2404.04943

Published 4/9/2024 by Zhihui Deng, Yuanyuan Duan, Leilai Shao, Xiaolei Zhu

Abstract

Chiplet-based systems, integrating various silicon dies manufactured at different integrated circuit technology nodes on a carrier interposer, have garnered significant attention in recent years due to their cost-effectiveness and competitive performance. The widespread adoption of reinforcement learning as a sequential placement method has introduced a new challenge in determining the optimal placement order for each chiplet. The order in which chiplets are placed on the interposer influences the spatial resources available for earlier and later placed chiplets, making the placement results highly sensitive to the sequence of chiplet placement. To address these challenges, we propose a learning to rank approach with graph representation, building upon the reinforcement learning framework RLPlanner. This method aims to select the optimal chiplet placement order for each chiplet-based system. Experimental results demonstrate that compared to placement order obtained solely based on the descending order of the chiplet area and the number of interconnect wires between the chiplets, utilizing the placement order obtained from the learning to rank network leads to further improvements in system temperature and inter-chiplet wirelength. Specifically, applying the top-ranked placement order obtained from the learning to rank network results in a 10.05% reduction in total inter-chiplet wirelength and a 1.01% improvement in peak system temperature during the chiplet placement process.

Overview

Chiplet-based systems integrate different silicon dies on a carrier interposer, offering cost-effectiveness and competitive performance.
The placement order of chiplets on the interposer is crucial, as it affects the spatial resources available for each chiplet.
The research proposes a learning to rank approach with graph representation to determine the optimal chiplet placement order.
Experiments show this method improves system temperature and inter-chiplet wirelength compared with placement orders based solely on descending chiplet area and the number of interconnect wires between chiplets.

Plain English Explanation

Chiplet-based systems are a way of building computer chips from several smaller dies. Instead of a single large chip, these systems combine multiple smaller "chiplets," which can be made using different manufacturing processes, on a shared carrier. This approach can be more cost-effective while still offering competitive performance.

One key challenge with chiplet-based systems is determining the best order to place the chiplets on the carrier, called an "interposer." The order matters because it affects the space available for each chiplet. Earlier-placed chiplets have more flexibility, while later-placed ones are more constrained.

To address this, the researchers developed a learning-based approach that can recommend the optimal placement order for the chiplets. This method uses a "learning to rank" technique, which learns to identify the best order by analyzing the relationships between the chiplets.

Compared with ordering the chiplets simply by descending area and number of interconnect wires, the researchers found that their recommended order gave better results. Specifically, it reduced the total length of the connections between chiplets by 10.05% and improved the peak temperature of the system by 1.01%. The wirelength gain is the larger of the two; the temperature gain is modest.

Technical Explanation

The researchers proposed a learning to rank approach with graph representation to determine the optimal chiplet placement order for chiplet-based systems. This builds upon the RLPlanner reinforcement learning framework.
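To make the learning to rank idea concrete, here is a minimal sketch of a pairwise ranking loss in the style of RankNet. This is an assumption for illustration only: the summary does not say which ranking loss the paper uses, and the names `pairwise_logistic_loss`, `top_ranked`, `scores`, and `costs` are hypothetical. Each candidate placement order receives a predicted score, and the loss penalizes pairs in which the candidate with the worse measured cost is scored higher.

```python
import math


def pairwise_logistic_loss(scores, costs):
    """Average RankNet-style loss over all ordered pairs of candidates.

    scores: predicted score per candidate order (higher = ranked better).
    costs:  measured placement cost per candidate (lower = truly better).
    Both inputs are hypothetical and exist only for this illustration.
    """
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if costs[i] < costs[j]:  # candidate i should outrank candidate j
                margin = scores[i] - scores[j]
                total += math.log1p(math.exp(-margin))
                pairs += 1
    return total / pairs if pairs else 0.0


def top_ranked(scores):
    """Index of the candidate order with the highest predicted score."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Made-up numbers: three candidate placement orders and their costs.
costs = [3.0, 1.0, 2.0]
good_scores = [0.1, 2.0, 1.0]  # agrees with the cost ranking
bad_scores = [2.0, 0.1, 1.0]   # reverses the cost ranking

print(pairwise_logistic_loss(good_scores, costs))
print(pairwise_logistic_loss(bad_scores, costs))
print(top_ranked(good_scores))
```

A scorer whose ranking agrees with the measured costs yields a lower loss than one that reverses them, which is the signal a ranking network would be trained on.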

The key idea is to model the chiplet placement problem as a graph, where the chiplets are nodes and the connections between them are edges. The learning to rank network then learns to predict the optimal order to place the chiplets on the interposer by analyzing the relationships in this graph.
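The graph view described above can be sketched as follows. This is a minimal illustration, not the paper's data format: the function `build_chiplet_graph`, the choice of node features (area and total wire count), and the example chiplet names and values are all assumptions.

```python
def build_chiplet_graph(areas, wires):
    """Build a small undirected graph of chiplets.

    areas: {chiplet_name: area}
    wires: {(chiplet_a, chiplet_b): number_of_interconnect_wires}
    Returns (names, node_features, adjacency), where adjacency[i][j]
    holds the wire count between chiplets i and j.
    """
    names = sorted(areas)
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    adjacency = [[0] * n for _ in range(n)]
    for (a, b), count in wires.items():
        i, j = index[a], index[b]
        adjacency[i][j] += count
        adjacency[j][i] += count  # connections are treated as undirected
    # Hypothetical node features: [area, total wires touching the chiplet].
    node_features = [[areas[name], sum(adjacency[index[name]])] for name in names]
    return names, node_features, adjacency


# Made-up example system with three chiplets.
areas = {"cpu": 100.0, "gpu": 150.0, "hbm": 60.0}
wires = {("cpu", "gpu"): 64, ("gpu", "hbm"): 128}

names, node_features, adjacency = build_chiplet_graph(areas, wires)
for name, features in zip(names, node_features):
    print(name, features)
```

Node features and the weighted adjacency matrix are the kind of input a graph-based ranking network could consume; the features actually used in the paper may differ.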

Experiments were conducted to evaluate the proposed approach. The researchers compared the placement order obtained from their learning to rank network against orders obtained solely from the descending order of chiplet area and the number of interconnect wires between chiplets.
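The heuristic baselines can be sketched as two simple sorts. Treating area and wire count as two separate orderings is one reading of the description; the abstract does not spell out how the two criteria are combined. The function names and the example data are hypothetical.

```python
def order_by_area(areas):
    """Chiplet names sorted by descending area (ties broken by name)."""
    return sorted(areas, key=lambda name: (-areas[name], name))


def order_by_wire_count(areas, wires):
    """Chiplet names sorted by descending total interconnect wire count."""
    degree = {name: 0 for name in areas}
    for (a, b), count in wires.items():
        degree[a] += count
        degree[b] += count
    return sorted(areas, key=lambda name: (-degree[name], name))


# Made-up example system with three chiplets.
areas = {"cpu": 100.0, "gpu": 150.0, "hbm": 60.0}
wires = {("cpu", "gpu"): 64, ("gpu", "hbm"): 128}

print(order_by_area(areas))
print(order_by_wire_count(areas, wires))
```

Even on this toy system the two heuristics disagree about which chiplet to place second, which illustrates why a learned ranking over candidate orders can be useful.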

The results showed that applying the top-ranked placement order from the learning to rank network led to a 10.05% reduction in total inter-chiplet wirelength and a 1.01% improvement in peak system temperature compared with those baseline orders.
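For readers who want to see how such percentages are typically computed, here is a small sketch of a relative-reduction calculation. The wirelength values below are made up so that the result matches the reported figure; they are not taken from the paper. Note also that a percentage change in temperature depends on the temperature scale used, which this summary does not specify.

```python
def percent_reduction(baseline, improved):
    """Relative reduction of `improved` versus `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (baseline - improved) / baseline * 100.0


# Hypothetical wirelength values chosen only to illustrate the arithmetic.
baseline_wirelength = 2000.0
ranked_wirelength = 1799.0

reduction = percent_reduction(baseline_wirelength, ranked_wirelength)
print(f"{reduction:.2f}% reduction")
```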

Critical Analysis

The paper presents a promising approach for determining the chiplet placement order in chiplet-based systems. Combining learning to rank with a graph representation targets a real weakness of sequential placement methods: their results are highly sensitive to the order in which chiplets are placed.

One potential limitation of the research is that it focuses only on optimizing for temperature and wirelength. There may be other important factors, such as power consumption or reliability, that could be considered in the placement optimization. Additionally, the paper does not provide much discussion of the computational complexity or runtime of the proposed algorithm, which could be an important practical consideration.

Another area for further research could be to explore the integration of this placement optimization approach with other chiplet-level design and optimization techniques. For example, it may be possible to jointly optimize the chiplet placement order alongside other design parameters, such as the assignment of functionality to each chiplet.

Overall, the research makes a valuable contribution to the field of chiplet-based system design and highlights the importance of optimizing the placement order for these emerging architectures.

Conclusion

This paper presents a learning to rank approach with graph representation to determine the placement order of chiplets in chiplet-based systems. Compared with placement orders based on descending chiplet area and interconnect wire count, the top-ranked order yields a 10.05% reduction in total inter-chiplet wirelength and a 1.01% improvement in peak system temperature.

The research highlights the critical role of placement order optimization in chiplet-based systems and demonstrates the effectiveness of using a learning-based approach to address this challenge. As chiplet-based architectures continue to gain traction, this work provides a valuable tool for system designers to optimize the performance and efficiency of these emerging computer systems.

Related Papers

FPGA Divide-and-Conquer Placement using Deep Reinforcement Learning

Shang Wang, Deepak Ranganatha Sastry Mamillapalli, Tianpei Yang, Matthew E. Taylor

This paper introduces the problem of learning to place logic blocks in Field-Programmable Gate Arrays (FPGAs) and a learning-based method. In contrast to previous search-based placement algorithms, we instead employ Reinforcement Learning (RL) with the goal of minimizing wirelength. In addition to our preliminary learning results, we also evaluated a novel decomposition to address the nature of large search space when placing many blocks on a chipboard. Empirical experiments evaluate the effectiveness of the learning and decomposition paradigms on FPGA placement tasks.

4/23/2024

cs.AR cs.AI cs.LG

DG-RePlAce: A Dataflow-Driven GPU-Accelerated Analytical Global Placement Framework for Machine Learning Accelerators

Andrew B. Kahng, Zhiang Wang

Global placement is a fundamental step in VLSI physical design. The wide use of 2D processing element (PE) arrays in machine learning accelerators poses new challenges of scalability and Quality of Results (QoR) for state-of-the-art academic global placers. In this work, we develop DG-RePlAce, a new and fast GPU-accelerated global placement framework built on top of the OpenROAD infrastructure, which exploits the inherent dataflow and datapath structures of machine learning accelerators. Experimental results with a variety of machine learning accelerators using a commercial 12nm enablement show that, compared with RePlAce (DREAMPlace), our approach achieves an average reduction in routed wirelength by 10% (7%) and total negative slack (TNS) by 31% (34%), with faster global placement and on-par total runtimes relative to DREAMPlace. Empirical studies on the TILOS MacroPlacement Benchmarks further demonstrate that post-route improvements over RePlAce and DREAMPlace may reach beyond the motivating application to machine learning accelerators.

4/23/2024

cs.AR cs.LG

Learning to rank quantum circuits for hardware-optimized performance enhancement

Gavin S. Hartnett, Aaron Barbosa, Pranav S. Mundada, Michael Hush, Michael J. Biercuk, Yuval Baum

We introduce and experimentally test a machine-learning-based method for ranking logically equivalent quantum circuits based on expected performance estimates derived from a training procedure conducted on real hardware. We apply our method to the problem of layout selection, in which abstracted qubits are assigned to physical qubits on a given device. Circuit measurements performed on IBM hardware indicate that the maximum and median fidelities of logically equivalent layouts can differ by an order of magnitude. We introduce a circuit score used for ranking that is parameterized in terms of a physics-based, phenomenological error model whose parameters are fit by training a ranking-loss function over a measured dataset. The dataset consists of quantum circuits exhibiting a diversity of structures and executed on IBM hardware, allowing the model to incorporate the contextual nature of real device noise and errors without the need to perform an exponentially costly tomographic protocol. We perform model training and execution on the 16-qubit ibmq_guadalupe device and compare our method to two common approaches: random layout selection and a publicly available baseline called Mapomatic. Our model consistently outperforms both approaches, predicting layouts that exhibit lower noise and higher performance. In particular, we find that our best model leads to a $1.8\times$ reduction in selection error when compared to the baseline approach and a $3.2\times$ reduction when compared to random selection. Beyond delivering a new form of predictive quantum characterization, verification, and validation, our results reveal the specific way in which context-dependent and coherent gate errors appear to dominate the divergence from performance estimates extrapolated from simple proxy measures.

4/11/2024

cs.LG

Constrained Object Placement Using Reinforcement Learning

Benedikt Kreis, Nils Dengler, Jorge de Heuvel, Rohit Menon, Hamsa Datta Perur, Maren Bennewitz

Close and precise placement of irregularly shaped objects requires a skilled robotic system. Particularly challenging is the manipulation of objects that have sensitive top surfaces and a fixed set of neighbors. To avoid damaging the surface, they have to be grasped from the side, and during placement, their neighbor relations have to be maintained. In this work, we train a reinforcement learning agent that generates smooth end-effector motions to place objects as close as possible next to each other. During the placement, our agent considers neighbor constraints defined in a given layout of the objects while trying to avoid collisions. Our approach learns to place compact object assemblies without the need for predefined spacing between objects as required by traditional methods. We thoroughly evaluated our approach using a two-finger gripper mounted to a robotic arm with six degrees of freedom. The results show that our agent outperforms two baseline approaches in terms of object assembly compactness, thereby reducing the needed space to place the objects according to the given neighbor constraints. On average, our approach reduces the distances between all placed objects by at least 60%, with fewer collisions at the same compactness compared to both baselines.

4/17/2024

cs.RO