The future of cosmological likelihood-based inference: accelerated high-dimensional parameter estimation and model comparison

2405.12965

Published 5/22/2024 by Davide Piras, Alicja Polanska, Alessio Spurio Mancini, Matthew A. Price, Jason D. McEwen

Abstract

We advocate for a new paradigm of cosmological likelihood-based inference, leveraging recent developments in machine learning and its underlying technology, to accelerate Bayesian inference in high-dimensional settings. Specifically, we combine (i) emulation, where a machine learning model is trained to mimic cosmological observables, e.g. CosmoPower-JAX; (ii) differentiable and probabilistic programming, e.g. JAX and NumPyro, respectively; (iii) scalable Markov chain Monte Carlo (MCMC) sampling techniques that exploit gradients, e.g. Hamiltonian Monte Carlo; and (iv) decoupled and scalable Bayesian model selection techniques that compute the Bayesian evidence purely from posterior samples, e.g. the learned harmonic mean implemented in harmonic. This paradigm allows us to carry out a complete Bayesian analysis, including both parameter estimation and model selection, in a fraction of the time of traditional approaches. First, we demonstrate the application of this paradigm on a simulated cosmic shear analysis for a Stage IV survey in 37- and 39-dimensional parameter spaces, comparing $\Lambda$CDM and a dynamical dark energy model ($w_0w_a$CDM). We recover posterior contours and evidence estimates that are in excellent agreement with those computed by the traditional nested sampling approach while reducing the computational cost from 8 months on 48 CPU cores to 2 days on 12 GPUs. Second, we consider a joint analysis between three simulated next-generation surveys, each performing a 3x2pt analysis, resulting in 157- and 159-dimensional parameter spaces. Standard nested sampling techniques are simply not feasible in this high-dimensional setting, requiring a projected 12 years of compute time on 48 CPU cores; on the other hand, the proposed approach only requires 8 days of compute time on 24 GPUs. All packages used in our analyses are publicly available.

Overview

The paper advocates for a new approach to cosmological Bayesian inference, leveraging recent advancements in machine learning and related technologies.
The proposed paradigm combines emulation, differentiable and probabilistic programming, scalable Markov chain Monte Carlo (MCMC) sampling, and decoupled Bayesian model selection techniques.
This approach aims to significantly accelerate Bayesian inference in high-dimensional settings compared to traditional methods.

Plain English Explanation

The paper introduces a new way to analyze data from cosmological surveys, which are large-scale observations of the universe. Traditionally, this analysis has been computationally intensive and time-consuming. However, the authors propose a novel approach that combines several recent developments in machine learning and related fields to make the process much faster.

At the core of this approach is the idea of emulation, where a machine learning model is trained to mimic cosmological observables, standing in for the slower numerical calculations that would normally produce them. This allows the researchers to quickly generate predictions for different parameter values without rerunning the full calculation each time. They also rely on differentiable programming, which supplies gradients of the model automatically, and probabilistic programming, which expresses the statistical model directly in code.
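As a rough illustration of emulation, the sketch below fits a cheap surrogate to a handful of evaluations of a "slow" function and then differentiates the surrogate analytically. Everything here is an assumption made for illustration: `slow_observable` is a hypothetical stand-in for an expensive theory code, and a degree-4 polynomial regression replaces the neural-network emulator (CosmoPower-JAX) that the paper actually uses.

```python
import math


def slow_observable(theta):
    """Hypothetical stand-in for an expensive theory calculation (an assumption)."""
    return math.exp(-theta)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def train_emulator(thetas, values, degree=4):
    """Least-squares polynomial fit via the normal equations."""
    feats = [[t ** k for k in range(degree + 1)] for t in thetas]
    size = degree + 1
    gram = [[sum(f[i] * f[j] for f in feats) for j in range(size)] for i in range(size)]
    rhs = [sum(f[i] * v for f, v in zip(feats, values)) for i in range(size)]
    return solve(gram, rhs)


def emulate(coeffs, theta):
    """Cheap surrogate prediction."""
    return sum(c * theta ** k for k, c in enumerate(coeffs))


def emulate_grad(coeffs, theta):
    """Analytic derivative of the surrogate with respect to theta."""
    return sum(k * c * theta ** (k - 1) for k, c in enumerate(coeffs) if k > 0)


# "Training set": 21 evaluations of the slow function on [0, 1].
train_thetas = [i / 20 for i in range(21)]
train_values = [slow_observable(t) for t in train_thetas]
coeffs = train_emulator(train_thetas, train_values)

# Check the surrogate on points it was not trained on.
test_thetas = [0.025 + i / 20 for i in range(20)]
max_error = max(abs(emulate(coeffs, t) - slow_observable(t)) for t in test_thetas)
grad_error = abs(emulate_grad(coeffs, 0.5) - (-math.exp(-0.5)))
print(f"max emulator error: {max_error:.2e}, gradient error at 0.5: {grad_error:.2e}")
```

The point of the toy is the division of labour: the expensive function is called only while building the training set, and afterwards both predictions and their derivatives come from the surrogate. In a JAX-based emulator the derivative would be obtained by automatic differentiation rather than written by hand.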

Additionally, the authors use advanced sampling techniques that can exploit gradient information to explore the high-dimensional parameter space more effectively. They also employ a Bayesian model selection approach that can compute the relative probabilities of different cosmological models using only the posterior samples, without the need for traditional nested sampling.
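To make the idea of gradient-based sampling concrete, here is a minimal Hamiltonian Monte Carlo sketch in plain Python. It is an illustration under stated assumptions rather than the paper's setup: the target is a toy 5-dimensional standard Gaussian, the gradient is written by hand (in JAX it would typically come from `jax.grad`), and the step size and trajectory length are fixed by hand instead of being tuned adaptively.

```python
import math
import random


def log_prob(theta):
    """Toy log-posterior: standard Gaussian, up to a constant (an assumption)."""
    return -0.5 * sum(t * t for t in theta)


def grad_log_prob(theta):
    """Hand-written gradient of log_prob; automatic differentiation would supply this."""
    return [-t for t in theta]


def hmc_step(theta, rng, step_size=0.2, n_leapfrog=10):
    """One HMC transition: sample momentum, integrate with leapfrog, accept or reject."""
    p = [rng.gauss(0.0, 1.0) for _ in theta]
    h_start = -log_prob(theta) + 0.5 * sum(m * m for m in p)

    q = list(theta)
    grad = grad_log_prob(q)
    # Initial half step for the momentum.
    p = [m + 0.5 * step_size * g for m, g in zip(p, grad)]
    for i in range(n_leapfrog):
        # Full step for the position.
        q = [x + step_size * m for x, m in zip(q, p)]
        grad = grad_log_prob(q)
        # Full momentum step, except for a final half step at the end.
        scale = step_size if i < n_leapfrog - 1 else 0.5 * step_size
        p = [m + scale * g for m, g in zip(p, grad)]

    h_end = -log_prob(q) + 0.5 * sum(m * m for m in p)
    # Metropolis correction for the integration error.
    accept_prob = math.exp(min(0.0, h_start - h_end))
    if rng.random() < accept_prob:
        return q, True
    return list(theta), False


rng = random.Random(1)
theta = [0.0] * 5
samples, n_accepted = [], 0
for _ in range(2000):
    theta, accepted = hmc_step(theta, rng)
    samples.append(theta)
    n_accepted += accepted

first = [s[0] for s in samples]
mean0 = sum(first) / len(first)
var0 = sum((x - mean0) ** 2 for x in first) / len(first)
accept_rate = n_accepted / len(samples)
print(f"mean={mean0:.3f} var={var0:.3f} acceptance={accept_rate:.2f}")
```

Because each proposal follows the gradient along a simulated trajectory, it can move far from the current point while still being accepted with high probability, which is the property that makes this family of samplers attractive in high-dimensional parameter spaces.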

By combining these elements, the researchers demonstrate that they can perform a complete Bayesian analysis, including both parameter estimation and model selection, much faster than traditional methods. This has important implications for the field of cosmology, as it allows researchers to explore more complex models and make more detailed inferences about the nature of the universe.

Technical Explanation

The paper introduces a new paradigm for cosmological Bayesian inference that leverages recent advancements in machine learning and related technologies. The key elements of this approach are:

Emulation: The authors use a machine learning model, such as CosmoPower-JAX, trained to mimic cosmological observables, allowing for fast predictions of those observables.
Differentiable and probabilistic programming: JAX supplies automatic differentiation and NumPyro supplies a probabilistic programming layer, so that the model and the gradients needed for sampling are available within a single framework.
Scalable MCMC sampling: The authors employ Hamiltonian Monte Carlo techniques that can efficiently explore high-dimensional parameter spaces by exploiting gradient information.
Decoupled Bayesian model selection: The authors use a learned harmonic mean approach to compute the Bayesian evidence for model selection, without the need for traditional nested sampling.
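The last item in the list above, computing the evidence from posterior samples alone, can be sketched with a toy harmonic-mean-style estimator. The assumptions are substantial and worth stating: the problem is one-dimensional with a known answer, the posterior samples are drawn directly rather than by MCMC, and a Gaussian with shrunken variance stands in for the learned target density. None of the names below belong to the harmonic package.

```python
import math
import random


def log_likelihood(theta):
    """Toy Gaussian-shaped likelihood (an assumption)."""
    return -0.5 * theta * theta


def log_prior(theta):
    """Normalised standard normal prior."""
    return -0.5 * theta * theta - 0.5 * math.log(2.0 * math.pi)


def gaussian_log_pdf(x, mean, var):
    """Log density of a normalised 1D Gaussian."""
    return -0.5 * (x - mean) ** 2 / var - 0.5 * math.log(2.0 * math.pi * var)


def harmonic_mean_evidence(samples, shrink=0.5):
    """Estimate the evidence z from posterior samples only.

    Half of the samples fit a target density phi (a Gaussian whose variance is
    shrunk so its tails are narrower than the posterior's); the other half
    average phi / (likelihood * prior), which estimates 1 / z.
    """
    half = len(samples) // 2
    train, infer = samples[:half], samples[half:]
    mean = sum(train) / len(train)
    var = sum((s - mean) ** 2 for s in train) / len(train)
    target_var = shrink * var
    inv_z = sum(
        math.exp(gaussian_log_pdf(s, mean, target_var) - log_likelihood(s) - log_prior(s))
        for s in infer
    ) / len(infer)
    return 1.0 / inv_z


# For this toy problem the posterior is N(0, 1/2) and the evidence is 1/sqrt(2).
rng = random.Random(0)
posterior_samples = [rng.gauss(0.0, math.sqrt(0.5)) for _ in range(20000)]
z_estimate = harmonic_mean_evidence(posterior_samples)
z_exact = 1.0 / math.sqrt(2.0)
print(f"estimated evidence: {z_estimate:.4f}, exact: {z_exact:.4f}")
```

The estimator never asks how the samples were produced, which is what "decoupled" means here: any sampler, including a gradient-based one, can feed it. Keeping the target narrower than the posterior keeps the averaged ratio bounded, which is what controls the variance of the estimate.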

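The paper demonstrates this paradigm on two simulated analyses. The first is a cosmic shear analysis for a Stage IV survey in 37- and 39-dimensional parameter spaces, comparing $\Lambda$CDM with a dynamical dark energy model ($w_0w_a$CDM). Here the recovered posterior contours and evidence estimates are in excellent agreement with those from traditional nested sampling, while the computational cost drops from 8 months on 48 CPU cores to 2 days on 12 GPUs. The second is a joint analysis of three simulated next-generation surveys, each performing a 3x2pt analysis, in 157- and 159-dimensional parameter spaces. In this setting nested sampling would need a projected 12 years on 48 CPU cores, whereas the proposed approach takes 8 days on 24 GPUs.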
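For a sense of scale, the runtimes quoted in the abstract can be turned into approximate wall-clock ratios. The month and year lengths below are rough conversions chosen for illustration, and the two sides of each comparison run on different hardware (CPU cores versus GPUs), so these are elapsed-time ratios rather than like-for-like efficiency figures. The 12-year figure is itself a projection.

```python
DAYS_PER_MONTH = 30   # rough conversion (an assumption)
DAYS_PER_YEAR = 365   # rough conversion (an assumption)

# Cosmic shear: 8 months on 48 CPU cores vs. 2 days on 12 GPUs.
cosmic_shear_ratio = (8 * DAYS_PER_MONTH) / 2

# Joint 3x2pt: projected 12 years on 48 CPU cores vs. 8 days on 24 GPUs.
joint_3x2pt_ratio = (12 * DAYS_PER_YEAR) / 8

print(f"cosmic shear wall-clock ratio: ~{cosmic_shear_ratio:.0f}x")
print(f"joint 3x2pt wall-clock ratio:  ~{joint_3x2pt_ratio:.0f}x")
```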

Critical Analysis

The paper presents a compelling and well-designed approach to accelerating Bayesian inference in high-dimensional cosmological settings. The authors have carefully integrated several state-of-the-art techniques from machine learning and computational statistics to achieve this goal.

One potential limitation of the approach is the reliance on emulation models, which may not always capture the full complexity of the underlying cosmological simulations. While the authors demonstrate the effectiveness of their approach, it would be valuable to understand the impact of emulator errors on the final parameter estimates and model selection results.

Additionally, the paper focuses on simulated data, and it would be interesting to see how the proposed paradigm performs on real-world cosmological datasets. The authors acknowledge this and suggest that further validation on actual survey data is an important area for future research.

Another concern is the potential for overfitting in the Bayesian model selection process, as the learned harmonic mean approach may be sensitive to the specific characteristics of the posterior samples. The authors could explore ways to further validate the robustness of their model selection results.

Overall, the paper presents a significant advance in cosmological Bayesian inference, although its benefits should be weighed against the limitations noted above. The public availability of the software packages used in the analyses is also commendable and will likely foster further development and adoption of these techniques.

Conclusion

The paper introduces a novel paradigm for cosmological Bayesian inference that leverages recent advancements in machine learning and computational statistics. By combining emulation, differentiable and probabilistic programming, scalable MCMC sampling, and decoupled Bayesian model selection, the authors demonstrate a significant acceleration in the analysis of high-dimensional cosmological datasets compared to traditional methods.

This work has important implications for the field of cosmology, as it enables researchers to explore more complex models and make more detailed inferences about the nature of the universe in a fraction of the time required by traditional approaches. The open-source availability of the software packages used in the analyses further enhances the potential impact of this research, as it facilitates the adoption and continued development of these techniques by the broader scientific community.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Deep Learning and genetic algorithms for cosmological Bayesian inference speed-up

Isidro Gómez-Vargas, J. Alberto Vázquez

In this paper, we present a novel approach to accelerate the Bayesian inference process, focusing specifically on the nested sampling algorithms. Bayesian inference plays a crucial role in cosmological parameter estimation, providing a robust framework for extracting theoretical insights from observational data. However, its computational demands can be substantial, primarily due to the need for numerous likelihood function evaluations. Our proposed method utilizes the power of deep learning, employing feedforward neural networks to approximate the likelihood function dynamically during the Bayesian inference process. Unlike traditional approaches, our method trains neural networks on-the-fly using the current set of live points as training data, without the need for pre-training. This flexibility enables adaptation to various theoretical models and datasets. We perform simple hyperparameter optimization using genetic algorithms to suggest initial neural network architectures for learning each likelihood function. Once sufficient accuracy is achieved, the neural network replaces the original likelihood function. The implementation integrates with nested sampling algorithms and has been thoroughly evaluated using both simple cosmological dark energy models and diverse observational datasets. Additionally, we explore the potential of genetic algorithms for generating initial live points within nested sampling inference, opening up new avenues for enhancing the efficiency and effectiveness of Bayesian inference methods.

5/7/2024

cs.LG cs.NE stat.ML

Fast Inference Using Automatic Differentiation and Neural Transport in Astroparticle Physics

Dorian W. P. Amaral, Shixiao Liang, Juehang Qin, Christopher Tunnell

Multi-dimensional parameter spaces are commonly encountered in astroparticle physics theories that attempt to capture novel phenomena. However, they often possess complicated posterior geometries that are expensive to traverse using techniques traditional to this community. Effectively sampling these spaces is crucial to bridge the gap between experiment and theory. Several recent innovations, which are only beginning to make their way into this field, have made navigating such complex posteriors possible. These include GPU acceleration, automatic differentiation, and neural-network-guided reparameterization. We apply these advancements to astroparticle physics experimental results in the context of novel neutrino physics and benchmark their performances against traditional nested sampling techniques. Compared to nested sampling alone, we find that these techniques increase performance for both nested sampling and Hamiltonian Monte Carlo, accelerating inference by factors of $\sim 100$ and $\sim 60$, respectively. As nested sampling also evaluates the Bayesian evidence, these advancements can be exploited to improve model comparison performance while retaining compatibility with existing implementations that are widely used in the natural sciences.

5/27/2024

cs.LG stat.ML

Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks

Javier Antoran

Large neural networks trained on large datasets have become the dominant paradigm in machine learning. These systems rely on maximum likelihood point estimates of their parameters, precluding them from expressing model uncertainty. This may result in overconfident predictions and it prevents the use of deep learning models for sequential decision making. This thesis develops scalable methods to equip neural networks with model uncertainty. In particular, we leverage the linearised Laplace approximation to equip pre-trained neural networks with the uncertainty estimates provided by their tangent linear models. This turns the problem of Bayesian inference in neural networks into one of Bayesian inference in conjugate Gaussian-linear models. Alas, the cost of this remains cubic in either the number of network parameters or in the number of observations times output dimensions. By assumption, neither are tractable. We address this intractability by using stochastic gradient descent (SGD) -- the workhorse algorithm of deep learning -- to perform posterior sampling in linear models and their convex duals: Gaussian processes. With this, we turn back to linearised neural networks, finding the linearised Laplace approximation to present a number of incompatibilities with modern deep learning practices -- namely, stochastic optimisation, early stopping and normalisation layers -- when used for hyperparameter learning. We resolve these and construct a sample-based EM algorithm for scalable hyperparameter learning with linearised neural networks. We apply the above methods to perform linearised neural network inference with ResNet-50 (25M parameters) trained on Imagenet (1.2M observations and 1000 output dimensions). Additionally, we apply our methods to estimate uncertainty for 3d tomographic reconstructions obtained with the deep image prior network.

5/1/2024

stat.ML cs.LG

Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo

Nayantara Mudur, Carolina Cuesta-Lazaro, Douglas P. Finkbeiner

Diffusion generative models have excelled at diverse image generation and reconstruction tasks across fields. A less explored avenue is their application to discriminative tasks involving regression or classification problems. The cornerstone of modern cosmology is the ability to generate predictions for observed astrophysical fields from theory and constrain physical models from observations using these predictions. This work uses a single diffusion generative model to address these interlinked objectives -- as a surrogate model or emulator for cold dark matter density fields conditional on input cosmological parameters, and as a parameter inference model that solves the inverse problem of constraining the cosmological parameters of an input field. The model is able to emulate fields with summary statistics consistent with those of the simulated target distribution. We then leverage the approximate likelihood of the diffusion generative model to derive tight constraints on cosmology by using the Hamiltonian Monte Carlo method to sample the posterior on cosmological parameters for a given test image. Finally, we demonstrate that this parameter inference approach is more robust to the addition of noise than baseline parameter inference networks.

5/9/2024

cs.LG