Reconstruction of Unstable Heavy Particles Using Deep Symmetry-Preserving Attention Networks

Read original: arXiv:2309.01886 - Published 5/2/2024 by Michael James Fenton, Alexander Shmakov, Hideki Okawa, Yuji Li, Ko-Yang Hsiao, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi


Overview

This paper presents an extended version of the Symmetry Preserving Attention Network (SPA-NET) architecture, which is used to reconstruct unstable heavy particles from detector data at the Large Hadron Collider (LHC).
The extended SPA-NET model can handle multiple input object types, such as leptons, as well as global event features, such as missing transverse momentum, in addition to hadronic jets.
The model also provides regression and classification outputs that supplement the parton assignment.
The performance of the extended SPA-NET is evaluated on semi-leptonic top quark pair decays and top quark pairs produced in association with a Higgs boson.

Plain English Explanation

At the Large Hadron Collider, physicists study the collisions of particles to learn about the fundamental building blocks of the universe. When certain heavy, unstable particles are produced in these collisions, they quickly decay into other particles that are detected by the experiment. Reconstructing the original heavy particles from the detected particles is a challenging task due to the large number of possible configurations.
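To give a sense of scale for "the large number of possible configurations", the following is a minimal counting sketch. It assumes a semi-leptonic top quark pair event in which four partons (a b quark from each top quark and two quarks from the hadronically decaying W boson) must be matched to distinct jets, and it treats the two W quarks as interchangeable. The function names are illustrative and are not taken from the paper.

```python
from itertools import permutations
from math import factorial


def count_assignments(n_jets, n_partons=4, n_swappable_pairs=1):
    """Count distinct jet-to-parton assignments.

    n_swappable_pairs is the number of parton pairs whose exchange gives a
    physically identical hypothesis (here, the two quarks from the W boson).
    """
    if n_jets < n_partons:
        return 0
    ordered = factorial(n_jets) // factorial(n_jets - n_partons)
    return ordered // (2 ** n_swappable_pairs)


def enumerate_assignments(n_jets):
    """Brute-force cross-check for (b_lep, b_had, q1, q2) with q1 <-> q2 identified."""
    seen = set()
    for b_lep, b_had, q1, q2 in permutations(range(n_jets), 4):
        seen.add((b_lep, b_had, frozenset((q1, q2))))
    return seen


for n in (4, 5, 6, 8):
    print(n, count_assignments(n))
```

Under these assumptions the count grows from 12 hypotheses for four jets to 840 for eight jets, which is why testing every combination by brute force becomes expensive in events with many jets.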

The researchers in this paper have extended a machine learning model called the Symmetry Preserving Attention Network (SPA-NET) to better handle this reconstruction problem. The extended SPA-NET model can take into account leptons and event-wide quantities such as missing transverse momentum, in addition to the hadronic jets handled by the earlier version. The model also provides regression and classification outputs that supplement the assignment of detected particles to the original heavy particles.

The researchers tested the extended SPA-NET model on two specific physics processes: semi-leptonic decays of top quark pairs and the production of top quark pairs along with a Higgs boson. They report significant improvements in the power of three representative studies: a search for the Higgs boson produced with top quarks, a measurement of the top quark mass, and a search for a hypothetical heavy Z' particle decaying into top quark pairs.

Technical Explanation

The Symmetry Preserving Attention Network (SPA-NET) is an attention-based machine learning architecture that was previously applied to top quark pair decays at the LHC in which both top quarks decay hadronically, so that the final state contains only hadronic jets. In this work, the researchers extend the SPA-NET architecture to handle multiple input object types, such as leptons, as well as global event features, such as missing transverse momentum.
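One way to picture "multiple input object types" is to give each type its own embedding into a shared latent width, so that jets, leptons, and global features can be processed together afterwards. The sketch below is a simplified illustration under that assumption, not the authors' implementation. The feature lists, the latent width, the random weights, and every name in it are hypothetical.

```python
import random

LATENT_DIM = 4  # hypothetical width of the shared latent space


def make_linear(n_in, n_out, rng):
    """Return random weights (n_out rows, n_in columns) and a zero bias."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def apply_linear(layer, x):
    """Apply y = W x + b to one feature vector."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


rng = random.Random(0)
# One embedding per object type; the feature counts are illustrative assumptions.
embedders = {
    "jet": make_linear(5, LATENT_DIM, rng),     # pt, eta, phi, mass, b-tag
    "lepton": make_linear(4, LATENT_DIM, rng),  # pt, eta, phi, flavour
    "global": make_linear(2, LATENT_DIM, rng),  # missing pT magnitude, phi
}


def embed_event(objects):
    """Map a list of (type, features) pairs to equal-width latent vectors."""
    return [apply_linear(embedders[kind], feats) for kind, feats in objects]


event = [
    ("jet", [85.0, 0.4, 1.2, 9.1, 1.0]),
    ("jet", [60.2, -1.1, -2.0, 7.3, 0.0]),
    ("lepton", [41.5, 0.9, 2.7, 1.0]),
    ("global", [52.3, -0.6]),
]
latent = embed_event(event)
print([len(v) for v in latent])
```

Because each object is embedded independently, reordering the inputs simply reorders the latent vectors, which is the property that lets later layers treat the event as an unordered set.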

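The extended SPA-NET model adds regression and classification outputs that supplement the parton assignment rather than performing it. The assignment itself, the matching of detector objects to partons, comes from the network's attention-based assignment outputs. The regression outputs estimate continuous quantities that the detector does not measure directly, such as neutrino kinematics, while the classification outputs label whole events, for example as signal or background.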
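The parton assignment that these outputs accompany can be pictured as choosing the highest-scoring combination of distinct jets, with scores that do not change when interchangeable partons are swapped. The sketch below illustrates that idea for a hadronically decaying top quark (one b quark plus two W-boson quarks). It is a toy under stated assumptions: the hand-written score table stands in for a network output, explicit averaging stands in for whatever mechanism the real architecture uses to enforce the symmetry, and all names are hypothetical.

```python
from itertools import permutations


def symmetrize(score):
    """Average a score over exchange of the two W-quark slots."""
    def sym(b, q1, q2):
        return 0.5 * (score(b, q1, q2) + score(b, q2, q1))
    return sym


def best_assignment(n_jets, score):
    """Return the highest-scoring (b, (q1, q2)) choice using distinct jets."""
    sym = symmetrize(score)
    best, best_val = None, float("-inf")
    for b, q1, q2 in permutations(range(n_jets), 3):
        if q1 > q2:
            continue  # (q1, q2) and (q2, q1) are the same hypothesis
        val = sym(b, q1, q2)
        if val > best_val:
            best, best_val = (b, (q1, q2)), val
    return best, best_val


# Toy, hand-written scores standing in for a network output (hypothetical).
raw = {(0, 1, 2): 0.9, (0, 2, 1): 0.5, (3, 1, 2): 0.6, (3, 2, 1): 0.6}


def toy_score(b, q1, q2):
    return raw.get((b, q1, q2), 0.0)


choice, value = best_assignment(4, toy_score)
print(choice, value)
```

Building the symmetry into the scores means the two orderings of the W quarks are never treated as competing answers, which halves the hypotheses that have to be compared.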

The researchers evaluate the performance of the extended SPA-NET in the context of semi-leptonic top quark pair decays and top quark pairs produced in association with a Higgs boson. They find significant improvements in the power of three representative studies: a search for ttH, a measurement of the top quark mass, and a search for a heavy Z' decaying to top quark pairs.

The researchers also present ablation studies to provide insight into what the network has learned in each case. These studies can help identify the key features and mechanisms that contribute to the improved performance of the extended SPA-NET model.

Critical Analysis

The paper presents a well-designed extension of the SPA-NET architecture to handle more complex particle physics processes at the LHC. The inclusion of leptons and global event features, as well as the addition of regression and classification outputs, is a clear advancement over the previous version of the model.

However, the paper does not address certain limitations of the approach. For example, the model still relies on a predefined set of possible parton assignments, which may not capture all the possible configurations in real-world data. Additionally, the paper does not discuss the computational efficiency of the extended SPA-NET model, which is an important consideration for real-time applications at the LHC.

Furthermore, the paper could have provided more insights into the specific features and mechanisms that the network has learned, beyond the high-level ablation studies. A deeper understanding of the network's inner workings could lead to further improvements or inspire the development of alternative approaches.

Overall, the paper presents a promising extension of the SPA-NET architecture, but there are opportunities for further research to address the limitations and provide a more comprehensive understanding of the model's performance.

Conclusion

This paper describes an extended version of the Symmetry Preserving Attention Network (SPA-NET) architecture, which is used to reconstruct unstable heavy particles from detector data at the Large Hadron Collider. The extended SPA-NET model can handle multiple input object types, such as leptons, as well as global event features, and provides regression and classification outputs that supplement the parton assignment.

The researchers found that the extended SPA-NET model significantly improved the performance of three representative studies: a search for the Higgs boson produced with top quarks, a measurement of the top quark mass, and a search for a hypothetical heavy Z' particle decaying into top quark pairs. While the paper presents a promising advancement, there are opportunities for further research to address the model's limitations and provide a deeper understanding of its inner workings.



Related Papers


Reconstruction of Unstable Heavy Particles Using Deep Symmetry-Preserving Attention Networks

Michael James Fenton, Alexander Shmakov, Hideki Okawa, Yuji Li, Ko-Yang Hsiao, Shih-Chieh Hsu, Daniel Whiteson, Pierre Baldi

Reconstructing unstable heavy particles requires sophisticated techniques to sift through the large number of possible permutations for assignment of detector objects to the underlying partons. An approach based on a generalized attention mechanism, symmetry preserving attention networks (SPA-NET), has been previously applied to top quark pair decays at the Large Hadron Collider which produce only hadronic jets. Here we extend the SPA-NET architecture to consider multiple input object types, such as leptons, as well as global event features, such as the missing transverse momentum. In addition, we provide regression and classification outputs to supplement the parton assignment. We explore the performance of the extended capability of SPA-NET in the context of semi-leptonic decays of top quark pairs as well as top quark pairs produced in association with a Higgs boson. We find significant improvements in the power of three representative studies: a search for ttH, a measurement of the top quark mass, and a search for a heavy Z' decaying to top quark pairs. We present ablation studies to provide insight on what the network has learned in each case.

5/2/2024

Reconstructing Richtmyer-Meshkov instabilities from noisy radiographs using low dimensional features and attention-based neural networks

Daniel A. Serino, Marc L. Klasky, Balasubramanya T. Nadiga, Xiaojian Xu, Trevor Wilcox

A trained attention-based transformer network can robustly recover the complex topologies given by the Richtmyer-Meshkov instability from a sequence of hydrodynamic features derived from radiographic images corrupted with blur, scatter, and noise. This approach is demonstrated on ICF-like double shell hydrodynamic simulations. The key component of this network is a transformer encoder that acts on a sequence of features extracted from noisy radiographs. This encoder includes numerous self-attention layers that act to learn temporal dependencies in the input sequences and increase the expressiveness of the model. This approach is demonstrated to exhibit an excellent ability to accurately recover the Richtmyer-Meshkov instability growth rates, despite the gas-metal interface being greatly obscured by radiographic noise.

8/6/2024

Autoencoders for Real-Time SUEP Detection

Simranjit Singh Chhibra, Nadezda Chernyavskaya, Benedikt Maier, Maurizio Pierini, Syed Hasan

Confining dark sectors with pseudo-conformal dynamics can produce Soft Unclustered Energy Patterns (SUEP) at the Large Hadron Collider: the production of dark quarks in proton-proton collisions leading to a dark shower and the high-multiplicity production of dark hadrons. The final experimental signature is spherically-symmetric energy deposits by an anomalously large number of soft Standard Model particles with a transverse energy of O(100) MeV. Assuming Yukawa-like couplings of the scalar portal state, the dominant production mode is gluon fusion, and the dominant background comes from multi-jet QCD events. We have developed a deep learning-based Anomaly Detection technique to reject QCD jets and identify any anomalous signature, including SUEP, in real-time in the High-Level Trigger system of the Compact Muon Solenoid experiment at the Large Hadron Collider. A deep convolutional neural autoencoder network has been trained using QCD events by taking transverse energy deposits in the inner tracker, electromagnetic calorimeter, and hadron calorimeter sub-detectors as 3-channel image data. Due to the sparse nature of the data, only ~0.5% of the total ~300 k image pixels have non-zero values. To tackle this challenge, a non-standard loss function, the inverse of the so-called Dice Loss, is exploited. The trained autoencoder with learned spatial features of QCD jets can detect 40% of the SUEP events, with a QCD event mistagging rate as low as 2%. The model inference time has been measured using the Intel Core i5-9600KF processor and found to be ~20 ms, which perfectly satisfies the High-Level Trigger system's latency of O(100) ms. Given the virtue of the unsupervised learning of the autoencoders, the trained model can be applied to any new physics model that predicts an experimental signature anomalous to QCD jets.

7/8/2024

Novel Approaches for ML-Assisted Particle Track Reconstruction and Hit Clustering

Uraz Odyurt, Nadezhda Dobreva, Zef Wolffs, Yue Zhao, Antonio Ferrer Sánchez, Roberto Ruiz de Austri Bazan, José D. Martín-Guerrero, Ana-Lucia Varbanescu, Sascha Caron

Track reconstruction is a vital aspect of High-Energy Physics (HEP) and plays a critical role in major experiments. In this study, we delve into unexplored avenues for particle track reconstruction and hit clustering. Firstly, we enhance the algorithmic design effort by utilising a simplified simulator (REDVID) to generate training data that is specifically composed for simplicity. We demonstrate the effectiveness of this data in guiding the development of optimal network architectures. Additionally, we investigate the application of image segmentation networks for this task, exploring their potential for accurate track reconstruction. Moreover, we approach the task from a different perspective by treating it as a hit sequence to track sequence translation problem. Specifically, we explore the utilisation of Transformer architectures for tracking purposes. Our preliminary findings are covered in detail. By considering this novel approach, we aim to uncover new insights and potential advancements in track reconstruction. This research sheds light on previously unexplored methods and provides valuable insights for the field of particle track reconstruction and hit clustering in HEP.

5/28/2024