Bayesian RG Flow in Neural Network Field Theories

Read original: arXiv:2405.17538 - Published 5/29/2024 by Jessica N. Howard, Marc S. Klinger, Anindita Maiti, Alexander G. Stapleton


Overview

This paper combines the Bayesian Renormalization Group (BRG) with the Neural Network Field Theory correspondence (NNFT), which maps neural network architectures into the space of statistical field theories.
In the combined framework, which the authors call BRG-NNFT, training a network and coarse-graining its trained parameters induce flows in opposite directions through the space of field theories.
The correspondence is demonstrated on two analytically tractable examples involving infinite-width networks and is corroborated by a numerical experiment on an ensemble of wide networks.

Plain English Explanation

The paper looks at a mathematical technique called the Renormalization Group (RG) flow, which is used to study how systems change at different scales. The researchers apply this technique to neural networks, which are a type of machine learning model inspired by the brain.

Typically, RG flow is used in physics to understand how the properties of a system, like a magnet or a liquid, change as you zoom in or out. In this paper, the researchers use a Bayesian version of RG in which "zooming out" means discarding the network parameters that the data can least reliably tell apart, rather than discarding small-distance detail.
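To make "zooming out" concrete, here is a minimal sketch of the classic block-spin coarse-graining step from real-space RG, written in Python with NumPy. It is a textbook illustration, not the Bayesian scheme discussed in the paper; the function name `block_spin`, the majority rule, and the tie-breaking choice are assumptions made for this example.

```python
import numpy as np

def block_spin(spins: np.ndarray, b: int = 2) -> np.ndarray:
    """Coarse-grain a square lattice of +/-1 spins by majority rule over b x b blocks."""
    n = spins.shape[0]
    if spins.shape != (n, n) or n % b != 0:
        raise ValueError("expected a square lattice whose side is divisible by b")
    # Give each b x b block its own pair of axes, then sum the spins inside it.
    block_sums = spins.reshape(n // b, b, n // b, b).sum(axis=(1, 3))
    # Majority rule; ties (sum == 0) are broken towards +1 (an arbitrary choice).
    return np.where(block_sums >= 0, 1, -1)

rng = np.random.default_rng(0)
lattice = rng.choice([-1, 1], size=(8, 8))  # fine-grained description
coarse = block_spin(lattice)                # 4 x 4 description
coarser = block_spin(coarse)                # 2 x 2 description
```

Each call replaces a patch of spins with a single representative spin, so repeated calls describe the same system at progressively larger scales.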

In this picture, training a neural network adds fine-grained information, and coarse-graining the trained parameters removes it again. The researchers work this out exactly for very wide networks and show that, for one particular architecture, the procedure reproduces a standard RG flow of a simple (free scalar) field theory.

Technical Explanation

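The paper unifies two existing ideas. The NNFT correspondence maps neural network (NN) architectures into the space of statistical field theories (SFTs). BRG is an information-theoretic coarse-graining scheme that generalizes the Exact Renormalization Group (ERG) to arbitrarily parameterized probability distributions, with coarse graining performed in parameter space at a distinguishability scale set by the Fisher information metric.

In the resulting BRG-NNFT framework, NN training dynamics induce a flow in the space of SFTs from the information-theoretic IR to the UV, while an information-shell coarse graining of the trained parameters induces the reverse flow from UV to IR. The authors construct BRG flows for trained, infinite-width NNs of arbitrary depth with generic activation functions. They then restrict to a single infinitely wide layer with scalar outputs and generalized cos-net activations, where BRG coarse graining corresponds exactly to the momentum-shell ERG flow of a free scalar SFT.

When the information-theoretic cutoff scale coincides with a standard momentum scale, BRG is equivalent to ERG. The analytic results are corroborated by a numerical experiment in which an ensemble of asymptotically wide NNs is trained and then renormalized using an information-shell BRG scheme.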

Critical Analysis

The paper presents a novel and promising approach to learning RG flows in a Bayesian framework, with potential applications in physics, machine learning, and other areas where RG techniques are used. However, the authors acknowledge several limitations and areas for future research.

One key limitation is scope: the analytic demonstrations rely on infinite-width networks, and the exact match to ERG is shown only for a single-layer architecture that corresponds to a free scalar theory. How the framework behaves for finite-width networks or interacting theories is a natural direction for future work.

Additionally, the authors note that the performance of the Bayesian RG flow method is sensitive to the choice of neural network architecture and hyperparameters. Further research may be needed to develop more robust and generalizable approaches.

Finally, the authors do not provide a comprehensive comparison to other Bayesian or non-Bayesian RG flow methods, which makes it difficult to fully assess the relative strengths and weaknesses of their approach. Expanding the experimental evaluation could help strengthen the claims of the paper.

Overall, the paper presents an interesting and potentially impactful contribution to the field of RG flow learning, but there are still opportunities for further research and refinement of the proposed techniques.

Conclusion

This paper unifies the Bayesian Renormalization Group with the Neural Network Field Theory correspondence to form the BRG-NNFT framework. Within it, training a network and coarse-graining its trained parameters appear as opposite flows in the space of statistical field theories, ordered by an information-theoretic scale set by the Fisher information metric.

The framework reduces to the familiar ERG when the information-theoretic cutoff coincides with a momentum scale, and for one family of infinitely wide networks it reproduces the momentum-shell flow of a free scalar theory exactly. The demonstrations so far are limited to analytically tractable, infinite-width settings.

Overall, the paper presents an innovative and promising direction for RG flow learning, with potential applications in physics, machine learning, and other fields where RG techniques are valuable. Further research to address the identified limitations and expand the experimental evaluation could help solidify the impact of this work.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers

Bayesian RG Flow in Neural Network Field Theories

Jessica N. Howard, Marc S. Klinger, Anindita Maiti, Alexander G. Stapleton

The Neural Network Field Theory correspondence (NNFT) is a mapping from neural network (NN) architectures into the space of statistical field theories (SFTs). The Bayesian renormalization group (BRG) is an information-theoretic coarse graining scheme that generalizes the principles of the Exact Renormalization Group (ERG) to arbitrarily parameterized probability distributions, including those of NNs. In BRG, coarse graining is performed in parameter space with respect to an information-theoretic distinguishability scale set by the Fisher information metric. In this paper, we unify NNFT and BRG to form a powerful new framework for exploring the space of NNs and SFTs, which we coin BRG-NNFT. With BRG-NNFT, NN training dynamics can be interpreted as inducing a flow in the space of SFTs from the information-theoretic 'IR' $\rightarrow$ 'UV'. Conversely, applying an information-shell coarse graining to the trained network's parameters induces a flow in the space of SFTs from the information-theoretic 'UV' $\rightarrow$ 'IR'. When the information-theoretic cutoff scale coincides with a standard momentum scale, BRG is equivalent to ERG. We demonstrate the BRG-NNFT correspondence on two analytically tractable examples. First, we construct BRG flows for trained, infinite-width NNs, of arbitrary depth, with generic activation functions. As a special case, we then restrict to architectures with a single infinitely-wide layer, scalar outputs, and generalized cos-net activations. In this case, we show that BRG coarse-graining corresponds exactly to the momentum-shell ERG flow of a free scalar SFT. Our analytic results are corroborated by a numerical experiment in which an ensemble of asymptotically wide NNs are trained and subsequently renormalized using an information-shell BRG scheme.
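As a rough illustration of coarse graining in parameter space at a scale set by the Fisher information, the Python sketch below uses a toy linear-Gaussian model y = Xθ + noise, whose Fisher information matrix is XᵀX/σ². Parameter directions whose Fisher eigenvalue falls below a cutoff are treated as indistinguishable and dropped. The toy model, the function names, and the hard-threshold rule are all assumptions made for this example; it does not reproduce the information-shell scheme of the paper.

```python
import numpy as np

def fisher_information(X: np.ndarray, noise_var: float) -> np.ndarray:
    """Fisher information of theta for y = X @ theta + N(0, noise_var) noise."""
    return X.T @ X / noise_var

def information_shell_projection(theta, fisher, cutoff):
    """Drop the components of theta along Fisher eigen-directions below the cutoff.

    Returns the coarse-grained parameter vector and the number of retained directions.
    """
    eigvals, eigvecs = np.linalg.eigh(fisher)  # symmetric matrix, ascending eigenvalues
    keep = eigvals >= cutoff
    coeffs = eigvecs.T @ theta                 # theta expressed in the eigenbasis
    coeffs[~keep] = 0.0                        # discard poorly distinguishable directions
    return eigvecs @ coeffs, int(keep.sum())

rng = np.random.default_rng(0)
scales = np.array([10.0, 1.0, 0.1, 0.01])      # columns with very different sensitivity
X = rng.normal(size=(200, 4)) * scales
theta = np.ones(4)
fisher = fisher_information(X, noise_var=1.0)

# Raising the cutoff removes more directions: a toy flow from fine to coarse description.
cutoffs = [0.0, 1.0, 1e3, 1e9]
retained = [information_shell_projection(theta, fisher, c)[1] for c in cutoffs]
```

Sweeping the cutoff upward plays the role of the RG scale here: the directions the data constrains least are removed first, and the best-constrained ones survive longest.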

5/29/2024

Wilsonian Renormalization of Neural Network Gaussian Processes

Jessica N. Howard, Ro Jefferson, Anindita Maiti, Zohar Ringel

Separating relevant and irrelevant information is key to any modeling process or scientific inquiry. Theoretical physics offers a powerful tool for achieving this in the form of the renormalization group (RG). Here we demonstrate a practical approach to performing Wilsonian RG in the context of Gaussian Process (GP) Regression. We systematically integrate out the unlearnable modes of the GP kernel, thereby obtaining an RG flow of the GP in which the data sets the IR scale. In simple cases, this results in a universal flow of the ridge parameter, which becomes input-dependent in the richer scenario in which non-Gaussianities are included. In addition to being analytically tractable, this approach goes beyond structural analogies between RG and neural networks by providing a natural connection between RG flow and learnable vs. unlearnable modes. Studying such flows may improve our understanding of feature learning in deep neural networks, and enable us to identify potential universality classes in these models.
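For readers unfamiliar with the ridge parameter in GP regression, the following Python sketch shows where it enters and why it separates learnable from unlearnable modes. On the training inputs, the posterior mean shrinks each kernel eigenmode with eigenvalue λ by the factor λ/(λ + ridge), so modes with λ far below the ridge are effectively not learned. The RBF kernel, the length scale, and the function names are assumptions made for this example, and the sketch does not implement the RG flow described in the abstract.

```python
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, length_scale: float = 0.5) -> np.ndarray:
    """Squared-exponential kernel between two sets of 1D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior_mean(x_train, y_train, x_test, ridge):
    """GP regression posterior mean: K_* (K + ridge * I)^{-1} y."""
    K = rbf_kernel(x_train, x_train)
    K_star = rbf_kernel(x_test, x_train)
    alpha = np.linalg.solve(K + ridge * np.eye(len(x_train)), y_train)
    return K_star @ alpha

def mode_learnability(x_train, ridge):
    """Shrinkage factor lambda / (lambda + ridge) for each kernel eigenmode."""
    eigvals = np.linalg.eigvalsh(rbf_kernel(x_train, x_train))
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off values
    return eigvals / (eigvals + ridge)

rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 20)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.normal(size=20)
x_test = np.linspace(0.0, 1.0, 5)

mean = gp_posterior_mean(x_train, y_train, x_test, ridge=0.01)
learnability = mode_learnability(x_train, ridge=0.01)  # near 1: learned, near 0: not learned
```

The `learnability` array makes the split visible: a few leading modes have factors close to 1, while the long tail of small-eigenvalue modes is suppressed by the ridge.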

8/15/2024


The Inverse of Exact Renormalization Group Flows as Statistical Inference

David S. Berman, Marc S. Klinger

We build on the view of the Exact Renormalization Group (ERG) as an instantiation of Optimal Transport described by a functional convection-diffusion equation. We provide a new information theoretic perspective for understanding the ERG through the intermediary of Bayesian Statistical Inference. This connection is facilitated by the Dynamical Bayesian Inference scheme, which encodes Bayesian inference in the form of a one parameter family of probability distributions solving an integro-differential equation derived from Bayes' law. In this note, we demonstrate how the Dynamical Bayesian Inference equation is, itself, equivalent to a diffusion equation which we dub Bayesian Diffusion. Identifying the features that define Bayesian Diffusion, and mapping them onto the features that define the ERG, we obtain a dictionary outlining how renormalization can be understood as the inverse of statistical inference.
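The idea of Bayesian inference as a one-parameter family of distributions can be illustrated with a much simpler, discrete stand-in: conjugate updating of a Gaussian mean with known noise variance, indexed by the number of observations. The Python sketch below is only that stand-in, under assumed prior and noise values; it is not the Dynamical Bayesian Inference equation from the abstract.

```python
import numpy as np

def gaussian_posterior(prior_mean, prior_var, data, noise_var):
    """Posterior mean and variance for a Gaussian mean with known noise variance."""
    n = len(data)
    post_precision = 1.0 / prior_var + n / noise_var
    post_mean = (prior_mean / prior_var + np.sum(data) / noise_var) / post_precision
    return post_mean, 1.0 / post_precision

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=50)

# One-parameter family of posteriors, indexed by how much data has been seen.
trajectory = [gaussian_posterior(0.0, 10.0, data[:n], 1.0) for n in range(0, 51, 10)]
variances = [var for _, var in trajectory]
```

The posterior variance shrinks monotonically as data arrives, so inference sharpens the distribution. Running the family backwards broadens it again, which is the sense in which coarse graining can be read as the inverse of inference.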

5/2/2024

Bifurcated Generative Flow Networks

Chunhui Li, Cheng-Hao Liu, Dianbo Liu, Qingpeng Cai, Ling Pan

Generative Flow Networks (GFlowNets), a new family of probabilistic samplers, have recently emerged as a promising framework for learning stochastic policies that generate high-quality and diverse objects proportionally to their rewards. However, existing GFlowNets often suffer from low data efficiency due to the direct parameterization of edge flows or reliance on backward policies that may struggle to scale up to large action spaces. In this paper, we introduce Bifurcated GFlowNets (BN), a novel approach that employs a bifurcated architecture to factorize the flows into separate representations for state flows and edge-based flow allocation. This factorization enables BN to learn more efficiently from data and better handle large-scale problems while maintaining the convergence guarantee. Through extensive experiments on standard evaluation benchmarks, we demonstrate that BN significantly improves learning efficiency and effectiveness compared to strong baselines.

6/5/2024