Adoption of a token-based authentication model for the CMS Submission Infrastructure

Read original: arXiv:2405.14644 - Published 5/24/2024 by Antonio Perez-Calero Yzquierdo, Marco Mascheroni, Edita Kizinevic, Farrukh Aftab Khan, Hyunwoo Kim, Maria Acosta Flechas, Nikos Tsipinakis, Saqib Haleem, Frank Wurthwein

Overview

  • The CMS Submission Infrastructure (SI) is the main computing resource provisioning system for CMS workloads, using multiple HTCondor pools to manage geographically distributed resources.
  • Historically, the SI has relied on Grid Security Infrastructure (GSI) for authentication, but this is being replaced with modern standards like IDTokens and Scitokens.
  • The migration to token-based authentication is already well underway, with HTCondor and HTCondor-CE supporting token-based job submission.

Plain English Explanation

The CMS Submission Infrastructure is the main system that provides computing resources for the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC). It uses a collection of HTCondor pools to manage and coordinate computing resources located at sites around the world.

In the past, authentication between the different components of the SI was based on the Grid Security Infrastructure (GSI), which uses digital certificates to verify identities. Modern authentication standards, by contrast, are based on capabilities and tokens rather than certificates. The Worldwide LHC Computing Grid (WLCG) has recognized this trend and aims to replace GSI across all of its workload management, data transfer, and storage access operations during the current LHC data-taking period, known as Run 3.

As part of this effort, the CMS SI team is in the process of phasing out the use of GSI and transitioning to IDTokens and Scitokens for authentication. The HTCondor software suite already supports token-based authentication, which has allowed the SI team to fully migrate the authentication between its internal components. Additionally, recent versions of the HTCondor-CE (Computing Element) also support tokens, enabling CMS resource requests to Grid sites using this technology to be granted through token exchange.

The migration is well underway. A rollout campaign to sites was successfully completed by the third quarter of 2022, and all of the HTCondor CEs used by CMS now receive Scitoken-based pilot jobs. For sites running ARC CEs, a parallel campaign was launched to foster adoption of the REST interface, which is required for token-based job submission via HTCondor-G; this campaign is also nearing completion.

Technical Explanation

The CMS Submission Infrastructure (SI) is the main computing resource provisioning system for the CMS experiment at the LHC. It manages a number of HTCondor pools that aggregate geographically distributed resources from the WLCG and other providers.

Historically, the authentication model for this infrastructure has relied on the Grid Security Infrastructure (GSI), which is based on identities and X509 certificates. However, modern authentication standards are increasingly shifting towards capability-based and token-based approaches, such as IDTokens and Scitokens.
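The difference between the two models can be illustrated with a minimal Python sketch. It is not taken from the paper: the policy table, the issuer URL, the subject name, and the scope names below are all hypothetical and chosen only for illustration. In the identity-based case the service must know who the caller is and look up what that identity may do; in the capability-based case the token itself states what its bearer may do, and the service only needs to trust the issuer.

```python
from dataclasses import dataclass

# Identity-based model (GSI-like): the service keeps its own map from a
# certificate subject to the operations that identity may perform.
# The subject name and operations are hypothetical.
DN_POLICY = {
    "/DC=org/DC=example/CN=cms-pilot": {"submit", "cancel"},
}


def authorize_by_identity(subject_dn: str, operation: str) -> bool:
    """Allow the operation only if the service's table grants it to this identity."""
    return operation in DN_POLICY.get(subject_dn, set())


# Capability-based model (token-like): the token carries the permitted
# operations as scopes. The issuer URL and scope names are hypothetical.
TRUSTED_ISSUERS = {"https://issuer.example.org/cms"}


@dataclass(frozen=True)
class Token:
    issuer: str
    scopes: frozenset


def authorize_by_capability(token: Token, operation: str) -> bool:
    """Allow the operation if a trusted issuer granted it in the token's scopes."""
    return token.issuer in TRUSTED_ISSUERS and operation in token.scopes


pilot_token = Token(
    issuer="https://issuer.example.org/cms",
    scopes=frozenset({"submit"}),
)
```

Here `authorize_by_capability(pilot_token, "submit")` succeeds without the service holding any per-user table, while a request to `"cancel"` with the same token is refused because that scope was never granted.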

In response to this trend, the WLCG aims to transparently replace GSI for all its workload management, data transfer, and storage access operations, with completion targeted during the current LHC Run 3. As part of this effort, the CMS SI group is phasing out the GSI-based parts of its authentication layers in favor of IDTokens and Scitokens.

The use of tokens is already well integrated into the HTCondor Software Suite, which has enabled the CMS SI team to fully migrate the authentication between the internal components of the infrastructure. Additionally, recent versions of the HTCondor-CE (Computing Element) also support tokens, allowing CMS resource requests to Grid sites using this technology to be granted through token exchange.

A rollout campaign to CMS sites was successfully completed by the third quarter of 2022, so that all HTCondor CEs in use by CMS now receive Scitoken-based pilot jobs. On the ARC CE side, a parallel campaign was launched to foster the adoption of the REST interface at CMS sites, which is required to enable token-based job submission via HTCondor-G; this campaign is also nearing completion.

Critical Analysis

The paper provides a clear overview of the CMS Submission Infrastructure's migration from the Grid Security Infrastructure (GSI) to a token-based authentication model using IDTokens and Scitokens. The authors highlight what has made the transition feasible: token support in the HTCondor Software Suite and in recent versions of the HTCondor-CE, which allows both the SI's internal components and its resource requests to Grid sites to authenticate with tokens.

One potential area for further research or consideration could be the impact of this migration on the overall performance and reliability of the CMS SI, as well as any potential challenges or issues that may arise during the full rollout of the token-based authentication system across all CMS sites. Additionally, it would be interesting to see how the CMS SI's experience with this transition could inform or influence similar efforts in other large-scale scientific computing infrastructures, such as those used by the ATLAS or LHCb experiments at the LHC.

Conclusion

The CMS Submission Infrastructure is undergoing a significant transition in its authentication model, moving away from the Grid Security Infrastructure (GSI) and towards a token-based approach using IDTokens and Scitokens. This shift aligns with broader trends in the computing industry and the Worldwide LHC Computing Grid's (WLCG) strategic goals for the current LHC data-taking period.

The migration to token-based authentication is already well underway, with the HTCondor Software Suite and HTCondor-CE both supporting the new model. A rollout campaign to CMS sites was completed by the third quarter of 2022, and all HTCondor CEs in use by CMS now receive Scitoken-based pilot jobs. A parallel effort to foster the adoption of the REST interface at sites running ARC CEs, required for token-based job submission via HTCondor-G, is also nearing completion.

This transition represents an important step forward in the CMS Submission Infrastructure's ability to leverage modern authentication standards and integrate more seamlessly with the evolving computing landscape of the LHC experiments and the broader scientific community.




Related Papers

Adoption of a token-based authentication model for the CMS Submission Infrastructure

Antonio Perez-Calero Yzquierdo, Marco Mascheroni, Edita Kizinevic, Farrukh Aftab Khan, Hyunwoo Kim, Maria Acosta Flechas, Nikos Tsipinakis, Saqib Haleem, Frank Wurthwein

The CMS Submission Infrastructure (SI) is the main computing resource provisioning system for CMS workloads. A number of HTCondor pools are employed to manage this infrastructure, which aggregates geographically distributed resources from the WLCG and other providers. Historically, the model of authentication among the diverse components of this infrastructure has relied on the Grid Security Infrastructure (GSI), based on identities and X509 certificates. In contrast, commonly used modern authentication standards are based on capabilities and tokens. The WLCG has identified this trend and aims at a transparent replacement of GSI for all its workload management, data transfer and storage access operations, to be completed during the current LHC Run 3. As part of this effort, and within the context of CMS computing, the Submission Infrastructure group is in the process of phasing out the GSI part of its authentication layers, in favor of IDTokens and Scitokens. The use of tokens is already well integrated into the HTCondor Software Suite, which has allowed us to fully migrate the authentication between internal components of SI. Additionally, recent versions of the HTCondor-CE support tokens as well, enabling CMS resource requests to Grid sites employing this CE technology to be granted by means of token exchange. After a rollout campaign to sites, successfully completed by the third quarter of 2022, the totality of HTCondor CEs in use by CMS are already receiving Scitoken-based pilot jobs. On the ARC CE side, a parallel campaign was launched to foster the adoption of the REST interface at CMS sites (required to enable token-based job submission via HTCondor-G), which is nearing completion as well. In this contribution, the newly adopted authentication model will be described. We will then report on the migration status and final steps towards complete GSI phase out in the CMS SI.

5/24/2024

The integration of heterogeneous resources in the CMS Submission Infrastructure for the LHC Run 3 and beyond

Antonio Perez-Calero Yzquierdo, Marco Mascheroni, Edita Kizinevic, Farrukh Aftab Khan, Hyunwoo Kim, Maria Acosta Flechas, Nikos Tsipinakis, Saqib Haleem

While the computing landscape supporting LHC experiments is currently dominated by x86 processors at WLCG sites, this configuration will evolve in the coming years. LHC collaborations will be increasingly employing HPC and Cloud facilities to process the vast amounts of data expected during the LHC Run 3 and the future HL-LHC phase. These facilities often feature diverse compute resources, including alternative CPU architectures like ARM and IBM Power, as well as a variety of GPU specifications. Using these heterogeneous resources efficiently is thus essential for the LHC collaborations reaching their future scientific goals. The Submission Infrastructure (SI) is a central element in CMS Computing, enabling resource acquisition and exploitation by CMS data processing, simulation and analysis tasks. The SI must therefore be adapted to ensure access and optimal utilization of this heterogeneous compute capacity. Some steps in this evolution have been already taken, as CMS is currently using opportunistically a small pool of GPU slots provided mainly at the CMS WLCG sites. Additionally, Power9 processors have been validated for CMS production at the Marconi-100 cluster at CINECA. This note will describe the updated capabilities of the SI to continue ensuring the efficient allocation and use of computing resources by CMS, despite their increasing diversity. The next steps towards a full integration and support of heterogeneous resources according to CMS needs will also be reported.

5/24/2024

HPC resources for CMS offline computing: An integration and scalability challenge for the Submission Infrastructure

Antonio Perez-Calero Yzquierdo, Marco Mascheroni, Edita Kizinevic, Farrukh Aftab Khan, Hyunwoo Kim, Maria Acosta Flechas, Nikos Tsipinakis, Saqib Haleem

The computing resource needs of LHC experiments are expected to continue growing significantly during Run 3 and into the HL-LHC era. The landscape of available resources will also evolve, as High Performance Computing (HPC) and Cloud resources will provide a comparable, or even dominant, fraction of the total compute capacity. The future years present a challenge for the experiments' resource provisioning models, both in terms of scalability and increasing complexity. The CMS Submission Infrastructure (SI) provisions computing resources for CMS workflows. This infrastructure is built on a set of federated HTCondor pools, currently aggregating 400k CPU cores distributed worldwide and supporting the simultaneous execution of over 200k computing tasks. Incorporating HPC resources into CMS computing represents firstly an integration challenge, as HPC centers are much more diverse compared to Grid sites. Secondly, evolving the present SI, dimensioned to harness the current CMS computing capacity, to reach the resource scales required for the HL-LHC phase, while maintaining global flexibility and efficiency, will represent an additional challenge for the SI. To preventively address future potential scalability limits, the SI team regularly runs tests to explore the maximum reach of our infrastructure. In this note, the integration of HPC resources into CMS offline computing is summarized, the potential concerns for the SI derived from the increased scale of operations are described, and the most recent results of scalability tests on the CMS SI are reported.

5/24/2024

Modeling Distributed Computing Infrastructures for HEP Applications

Maximilian Horzela, Henri Casanova, Manuel Giffels, Artur Gottmann, Robin Hofsaess, Gunter Quast, Simone Rossi Tisbeni, Achim Streit, Frédéric Suter

Predicting the performance of various infrastructure design options in complex federated infrastructures with computing sites distributed over a wide area network that support a plethora of users and workflows, such as the Worldwide LHC Computing Grid (WLCG), is not trivial. Due to the complexity and size of these infrastructures, it is not feasible to deploy experimental test-beds at large scales merely for the purpose of comparing and evaluating alternate designs. An alternative is to study the behaviours of these systems using simulation. This approach has been used successfully in the past to identify efficient and practical infrastructure designs for High Energy Physics (HEP). A prominent example is the Monarc simulation framework, which was used to study the initial structure of the WLCG. New simulation capabilities are needed to simulate large-scale heterogeneous computing systems with complex networks, data access and caching patterns. A modern tool to simulate HEP workloads that execute on distributed computing infrastructures based on the SimGrid and WRENCH simulation frameworks is outlined. Studies of its accuracy and scalability are presented using HEP as a case-study. Hypothetical adjustments to prevailing computing architectures in HEP are studied providing insights into the dynamics of a part of the WLCG and candidates for improvements.

5/14/2024