Parallel performance of shared memory parallel spectral deferred corrections

Read original: arXiv:2403.20135 - Published 8/6/2024 by Philip Freese, Sebastian Götschel, Thibaut Lunet, Daniel Ruprecht, Martin Schreiber

This paper investigates the parallel performance of parallel spectral deferred corrections (PSDC), a numerical approach that provides small-scale parallelism for solving initial value problems. The PSDC scheme is applied to the shallow water equation and, for efficiency, uses an implicit-explicit (IMEX) splitting that integrates fast modes implicitly and slow modes explicitly.

The paper describes parallel OpenMP-based implementations of PSDC in two simulation codes: the finite-volume-based operational ocean model ICON-O and the spherical-harmonics-based research code SWEET. Each implementation is benchmarked on a single node at the Jülich Supercomputing Centre: SWEET on the JUSUF system and ICON-O on the JUWELS system.

The results demonstrate a reduction in time-to-solution across a range of accuracies. For ICON-O, the PSDC implementation shows speedup over the currently used Adams-Bashforth-2 integrator with OpenMP loop parallelization. For SWEET, the PSDC implementation shows speedup over serial spectral deferred corrections and a second-order implicit-explicit integrator.
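To make the idea of small-scale parallelism across collocation nodes concrete, here is a minimal Python sketch of one IMEX SDC time step with a diagonal preconditioner, applied to the scalar test problem u' = (lam_fast + lam_slow) u. It is an illustration under stated assumptions, not the ICON-O or SWEET implementation: the function names, the Radau nodes used in the usage note, the choice of the node positions as diagonal entries, and the use of a Python thread pool in place of OpenMP threads are all assumptions made for this example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def collocation_matrix(nodes):
    """Q[m, j] = integral from 0 to nodes[m] of the j-th Lagrange polynomial."""
    nodes = np.asarray(nodes, dtype=float)
    M = len(nodes)
    Q = np.zeros((M, M))
    for j in range(M):
        others = np.delete(nodes, j)
        # Lagrange polynomial: roots at the other nodes, value 1 at nodes[j].
        lagrange = np.poly1d(others, True) / np.prod(nodes[j] - others)
        antiderivative = lagrange.integ()  # integration constant is 0
        Q[:, j] = antiderivative(nodes)
    return Q


def parallel_imex_sdc_step(lam_fast, lam_slow, dt, u0, nodes, n_sweeps):
    """One time step of IMEX SDC with a diagonal preconditioner (hypothetical helper).

    The fast term lam_fast*u is treated implicitly, the slow term lam_slow*u
    explicitly. Because the preconditioner is diagonal, the update at node m
    only needs the previous iterate, so all nodes can be updated concurrently.
    """
    nodes = np.asarray(nodes, dtype=float)
    M = len(nodes)
    Q = collocation_matrix(nodes)
    d = nodes  # assumed diagonal entries: implicit Euler from 0 to each node
    lam = lam_fast + lam_slow
    u = np.full(M, u0, dtype=complex)  # initial guess: copy u0 to every node

    def update_node(m):
        # Quadrature of the full right-hand side using the previous iterate.
        quadrature = dt * lam * (Q[m] @ u)
        rhs = u0 + quadrature - dt * lam_fast * d[m] * u[m]
        # Scalar "implicit solve" for the fast term at this node only.
        return rhs / (1.0 - dt * lam_fast * d[m])

    with ThreadPoolExecutor(max_workers=M) as pool:
        for _ in range(n_sweeps):
            # One worker per node; every worker reads the same previous iterate.
            u = np.array(list(pool.map(update_node, range(M))))
    return u
```

A possible usage with three Radau nodes on [0, 1] is `parallel_imex_sdc_step(-2.0, 0.5j, 0.1, 1.0, [(4 - 6**0.5) / 10, (4 + 6**0.5) / 10, 1.0], 40)`. When the sweeps converge, the result satisfies the collocation equations u = u0 + dt * lam * Q u, and the last entry approximates the solution at the end of the step. In a compiled code the per-node updates would typically be much more expensive than in this scalar example, which is what makes one thread per node worthwhile.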

Related Papers

Parallel performance of shared memory parallel spectral deferred corrections

Philip Freese, Sebastian Götschel, Thibaut Lunet, Daniel Ruprecht, Martin Schreiber

We investigate parallel performance of parallel spectral deferred corrections, a numerical approach that provides small-scale parallelism for the numerical solution of initial value problems. The scheme is applied to the shallow water equation and uses an IMEX splitting that integrates fast modes implicitly and slow modes explicitly in order to be efficient. We describe parallel $\texttt{OpenMP}$-based implementations of parallel SDC in two well-established simulation codes: the finite volume based operational ocean model $\texttt{ICON-O}$ and the spherical harmonics based research code $\texttt{SWEET}$. The implementations are benchmarked on a single node of the JUSUF ($\texttt{SWEET}$) and JUWELS ($\texttt{ICON-O}$) system at Jülich Supercomputing Centre. We demonstrate a reduction of time-to-solution across a range of accuracies. For $\texttt{ICON-O}$, we show speedup over the currently used Adams-Bashforth-2 integrator with $\texttt{OpenMP}$ loop parallelization. For $\texttt{SWEET}$, we show speedup over serial spectral deferred corrections and a second order implicit-explicit integrator.

8/6/2024

Adaptive time step selection for Spectral Deferred Correction

Thomas Baumann, Sebastian Götschel, Thibaut Lunet, Daniel Ruprecht, Robert Speck

Spectral Deferred Correction (SDC) is an iterative method for the numerical solution of ordinary differential equations. It works by refining the numerical solution for an initial value problem by approximately solving differential equations for the error, and can be interpreted as a preconditioned fixed-point iteration for solving the fully implicit collocation problem. We adopt techniques from embedded Runge-Kutta Methods (RKM) to SDC in order to provide a mechanism for adaptive time step size selection and thus increase computational efficiency of SDC. We propose two SDC-specific estimates of the local error that are generic and do not rely on problem specific quantities. We demonstrate a gain in efficiency over standard SDC with fixed step size and compare efficiency favorably against state-of-the-art adaptive RKM.

9/5/2024
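The abstract above mentions borrowing step size selection from embedded Runge-Kutta methods. As a rough illustration, the classical controller used with embedded methods can be sketched as follows. This is the textbook formula, not necessarily the exact rule used in the paper; the function names, the safety factor, and the growth limits are assumptions chosen for this example. Here `order` denotes the order of the solution whose local error is being estimated, so the estimate is assumed to scale like dt**(order + 1).

```python
def propose_step_size(dt, error_estimate, tolerance, order,
                      safety=0.9, min_factor=0.2, max_factor=5.0):
    """Classical step size controller for embedded methods (hypothetical helper).

    Chooses the new step so that the predicted local error, assumed to
    scale like dt**(order + 1), roughly matches the tolerance.
    """
    if error_estimate == 0.0:
        # No measurable error: grow by the largest allowed factor.
        return dt * max_factor
    factor = safety * (tolerance / error_estimate) ** (1.0 / (order + 1))
    # Limit how quickly the step size may shrink or grow.
    factor = min(max_factor, max(min_factor, factor))
    return dt * factor


def step_accepted(error_estimate, tolerance):
    """A step is kept if its estimated local error is within the tolerance."""
    return error_estimate <= tolerance
```

In a time loop, a rejected step would be repeated with the proposed smaller step size, while an accepted step would continue with the proposed (usually larger) one. Where the error estimate comes from is exactly what the paper's SDC-specific estimates address; this sketch treats it as a given number.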

A GPU-ready pseudo-spectral method for direct numerical simulations of multiphase turbulence

Alessio Roccon

In this work, we detail the GPU porting of an in-house pseudo-spectral solver tailored to large-scale, interface-resolved simulations of drop- and bubble-laden turbulent flows. The code relies on direct numerical simulation of the Navier-Stokes equations, used to describe the flow field, coupled with a phase-field method, used to describe the shape, deformation, and topological changes of the interface of the drops or bubbles. The governing equations (the Navier-Stokes and Cahn-Hilliard equations) are solved using a pseudo-spectral method that relies on transforming the variables into wavenumber space. The code targets large-scale simulations of drop- and bubble-laden turbulent flows and relies on multilevel parallelism. The first level of parallelism relies on the message-passing interface (MPI) and is used on multi-core architectures in CPU-based infrastructures. A second level of parallelism relies on OpenACC directives and cuFFT libraries and is used to accelerate code execution when GPU-based infrastructures are targeted. The resulting multiphase flow solver can be efficiently executed in heterogeneous computing infrastructures and exhibits a remarkable speed-up when GPUs are employed. Thanks to the modular structure of the code and the use of a directive-based strategy to offload code execution to GPUs, only minor code modifications are required when targeting different computing architectures. This improves code maintenance, version control and the implementation of additional modules or governing equations.

6/4/2024
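The core operation behind a pseudo-spectral method, as described in the abstract above, is to transform a field to wavenumber space, where differentiation becomes multiplication by i*k, and then transform back. A minimal one-dimensional NumPy sketch on a periodic domain is given below. It only illustrates the principle: the function name and the uniform periodic grid are assumptions, and the solver in the paper works in three dimensions with cuFFT on GPUs.

```python
import numpy as np


def spectral_derivative(u, length):
    """First derivative of a periodic, uniformly sampled signal (hypothetical helper).

    u      : samples on a uniform grid covering [0, length), endpoint excluded
    length : size of the periodic domain
    """
    n = u.size
    # Angular wavenumbers matching NumPy's FFT ordering.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    # Differentiate in wavenumber space, then transform back.
    return np.real(np.fft.ifft(1j * k * np.fft.fft(u)))
```

For a smooth periodic input such as sin(3x) sampled at 32 points on [0, 2*pi), the result agrees with 3*cos(3x) up to floating point rounding, which is the spectral accuracy that makes this family of methods attractive for turbulence simulations.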

Unlocking massively parallel spectral proper orthogonal decompositions in the PySPOD package

Marcin Rogowski, Brandon C. Y. Yeung, Oliver T. Schmidt, Romit Maulik, Lisandro Dalcin, Matteo Parsani, Gianmarco Mengaldo

We propose a parallel (distributed) version of the spectral proper orthogonal decomposition (SPOD) technique. The parallel SPOD algorithm distributes the spatial dimension of the dataset while leaving the time dimension undivided. This approach is adopted to preserve the non-distributed fast Fourier transform of the data in time, thereby avoiding the associated bottlenecks. The parallel SPOD algorithm is implemented in the PySPOD (https://github.com/MathEXLab/PySPOD) library and makes use of the standard message passing interface (MPI) library, implemented in Python via mpi4py (https://mpi4py.readthedocs.io/en/stable/). An extensive performance evaluation of the parallel package is provided, including strong and weak scalability analyses. The open-source library allows the analysis of large datasets of interest across the scientific community. Here, we present applications in fluid dynamics and geophysics that are extremely difficult (if not impossible) to achieve without a parallel algorithm. This work opens the path toward modal analyses of big quasi-stationary data, helping to uncover new unexplored spatio-temporal patterns.

8/1/2024
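The reason for distributing space while keeping time whole, as the abstract above describes, can be seen in a few lines of NumPy: a Fourier transform along the time axis acts on each spatial point independently, so each process can transform its own spatial block without communication. The sketch below emulates the ranks with `np.array_split` in a single process. It does not use the PySPOD or mpi4py APIs, and the function name and the (time, space) data layout are assumptions made for this example.

```python
import numpy as np


def blockwise_time_fft(data, n_ranks):
    """FFT in time, computed block by block over space (hypothetical helper).

    data    : array of shape (n_time, n_space)
    n_ranks : number of emulated processes, each owning a block of columns
    """
    # Each emulated rank receives all time samples for its spatial points.
    spatial_blocks = np.array_split(data, n_ranks, axis=1)
    # The transform along time is purely local to each block.
    local_spectra = [np.fft.fft(block, axis=0) for block in spatial_blocks]
    # Gathering the blocks reproduces the transform of the full dataset.
    return np.concatenate(local_spectra, axis=1)
```

Splitting along time instead would require a distributed FFT, with data exchanged between processes during the transform, which is the bottleneck the spatial decomposition avoids.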