Transferable Boltzmann Generators






Published 6/21/2024 by Leon Klein, Frank Noé


The generation of equilibrium samples of molecular systems has been a long-standing problem in statistical physics. Boltzmann Generators are a generative machine learning method that addresses this issue by learning a transformation via a normalizing flow from a simple prior distribution to the target Boltzmann distribution of interest. Recently, flow matching has been employed to train Boltzmann Generators for small molecular systems in Cartesian coordinates. We extend this work and propose a first framework for Boltzmann Generators that are transferable across chemical space, such that they predict zero-shot Boltzmann distributions for test molecules without being retrained for these systems. These transferable Boltzmann Generators allow approximate sampling from the target distribution of unseen systems, as well as efficient reweighting to the target Boltzmann distribution. The transferability of the proposed framework is evaluated on dipeptides, where we show that it generalizes efficiently to unseen systems. Furthermore, we demonstrate that our proposed architecture enhances the efficiency of Boltzmann Generators trained on single molecular systems.



  • This paper introduces "Transferable Boltzmann Generators" (TBGs): Boltzmann Generators that are transferable across chemical space, so that a trained model can sample the Boltzmann distribution of molecules it never saw during training, without being retrained.
  • TBGs build on Boltzmann Generators, which use a normalizing flow to map a simple prior distribution to the target Boltzmann distribution, and on recent work that trains such models with flow matching in Cartesian coordinates.
  • For unseen systems, TBGs provide approximate samples from the target distribution and support efficient reweighting to the target Boltzmann distribution. The authors evaluate this on dipeptides.

Plain English Explanation

TBGs are a new type of generative model for molecules. A molecule at a given temperature constantly changes shape, and some shapes are far more likely than others; the Boltzmann distribution describes how likely each one is. Boltzmann Generators learn to produce shapes with the right frequencies, but earlier versions had to be trained separately for each molecule. TBGs add the ability to work on molecules they were not trained on.

Generative models are a type of AI that creates new data resembling real-world examples, such as images or text. A Boltzmann Generator starts from simple random noise and learns a transformation that turns it into realistic molecular configurations. This offers an alternative to the long simulations traditionally used to collect such samples, and the generated configurations approximately follow the statistics of the true equilibrium distribution.

The key advance with TBGs is that what the model learns from its training molecules carries over to new ones. For example, a TBG trained on one set of dipeptides (small molecules built from two amino acids) can generate configurations for a dipeptide that was not in the training set, with no retraining. This makes TBGs more flexible than Boltzmann Generators that must be trained from scratch for every system.

Technical Explanation

TBGs build on Boltzmann Generators, which use a normalizing flow: a learned transformation that maps samples from a simple prior distribution to the complex target Boltzmann distribution. The framework extends recent work in which Boltzmann Generators for small molecular systems are trained with flow matching and operate directly on Cartesian coordinates.
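To make the training idea more concrete, here is a minimal sketch of a flow matching objective, the technique the abstract names for training Boltzmann Generators. It assumes a straight-line interpolation between a prior sample x0 and a data sample x1, which is one common choice and not necessarily the one used in the paper. The names flow_matching_loss and velocity_fn are hypothetical, and the toy velocity functions stand in for what would be a neural network in practice.

```python
import numpy as np

def flow_matching_loss(velocity_fn, x0, x1, t):
    """Mean squared error between a model velocity field and the
    straight-line target velocity (x1 - x0).

    velocity_fn: callable (x_t, t) -> array with the same shape as x_t
    x0: prior samples, shape (batch, dim)
    x1: data samples, shape (batch, dim)
    t:  interpolation times in [0, 1], shape (batch, 1)
    """
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight path at time t
    target = x1 - x0                # velocity of that path (constant in t)
    pred = velocity_fn(x_t, t)
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))

rng = np.random.default_rng(0)
shift = np.array([1.0, -2.0, 0.5])

x0 = rng.standard_normal((256, 3))  # samples from a Gaussian prior
x1 = x0 + shift                     # toy "data": the prior shifted by a constant
t = rng.uniform(size=(256, 1))

# Two toy velocity fields (assumptions for illustration only):
# one that matches the constant shift, and one that predicts no motion.
shift_field = lambda x, t: np.broadcast_to(shift, x.shape)
zero_field = lambda x, t: np.zeros_like(x)

loss_exact = flow_matching_loss(shift_field, x0, x1, t)
loss_zero = flow_matching_loss(zero_field, x0, x1, t)
print(loss_exact, loss_zero)
```

In this toy setup the field that matches the shift drives the loss to essentially zero, while the field that predicts no motion is penalized by the squared length of the shift. Training a real model amounts to minimizing this kind of loss over the network parameters.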

The key innovation is transferability. Rather than training one model per molecule, the framework is designed so that a single trained model can be applied to test molecules without retraining, predicting their Boltzmann distributions zero-shot. Samples for an unseen molecule are approximate, but they can be reweighted efficiently to the target Boltzmann distribution.

The authors evaluate transferability on dipeptides and report that the framework generalizes efficiently to unseen systems. They also report that the proposed architecture improves the efficiency of Boltzmann Generators trained on single molecular systems.

Critical Analysis

The evaluation of transferability is carried out on dipeptides, which are small molecular systems. How well the approach carries over to larger or chemically more diverse molecules is not established by these results.

One area that could be explored further is the scalability of TBGs to larger, higher-dimensional molecular systems. Reweighting in particular tends to become less efficient as system size grows, because small errors in the learned distribution compound into highly uneven importance weights, so techniques that keep the weights well behaved would be valuable.

Potential biases and failure modes of TBGs also deserve attention. As with any generative model, there is a risk of producing samples that do not faithfully represent the true underlying distribution, which could be problematic in safety-critical applications. Reweighting can correct for such errors only where the generated samples actually cover the relevant regions of configuration space.
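One way to quantify this risk, in line with the reweighting mentioned in the abstract, is to compute importance weights and an effective sample size. The sketch below is a generic illustration under stated assumptions, not the authors' code: the function name reweight is hypothetical, the target is a toy one-dimensional harmonic energy, and a deliberately too-wide Gaussian stands in for an imperfect generator.

```python
import numpy as np

def reweight(log_q, energy, kT=1.0):
    """Self-normalized importance weights and Kish effective sample size.

    log_q:  log-density the generator assigns to each sample, shape (n,)
    energy: potential energy U(x) of each sample, shape (n,)
    kT:     thermal energy, in the same units as `energy`
    """
    log_w = -energy / kT - log_q     # log of unnormalized weight exp(-U/kT) / q
    log_w = log_w - np.max(log_w)    # stabilize before exponentiating
    w = np.exp(log_w)
    w = w / np.sum(w)                # weights now sum to 1
    ess = 1.0 / np.sum(w ** 2)       # equals n when all weights are equal
    return w, ess

rng = np.random.default_rng(0)
n = 10_000

# Toy setup (assumption): target density proportional to exp(-x^2 / 2),
# sampled with a Gaussian "generator" that is too wide (sigma = 1.5).
sigma = 1.5
x = sigma * rng.standard_normal(n)
log_q = -0.5 * (x / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))
energy = 0.5 * x ** 2

w, ess = reweight(log_q, energy)
var_raw = np.mean(x ** 2)            # raw samples: close to sigma^2, biased
var_reweighted = np.sum(w * x ** 2)  # reweighted: close to the target value 1
print(var_raw, var_reweighted, ess / n)
```

An effective sample size far below the number of generated samples signals that the generator and the target disagree strongly and that reweighted estimates rest on only a few samples.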

Overall, the TBG framework represents an exciting advance in generative modeling, with the potential to unlock new capabilities in areas like molecular design, materials discovery, and beyond. Further research to address the remaining challenges could solidify TBGs as a powerful and versatile tool in the generative modeling toolbox.


Conclusion

Transferable Boltzmann Generators are presented as a first framework for Boltzmann Generators that transfer across chemical space. They build on normalizing flows trained with flow matching and predict Boltzmann distributions for test molecules zero-shot, without retraining.

For unseen systems, the framework offers approximate sampling together with efficient reweighting to the target Boltzmann distribution. If this carries over beyond dipeptides, it could open up applications in areas such as molecular design and materials science.

Further research is needed to address potential limitations and to extend the approach to larger systems. As the field of generative AI continues to evolve, TBGs may become a useful tool for sampling the equilibrium behavior of molecular systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers


Conditional Normalizing Flows for Active Learning of Coarse-Grained Molecular Representations

Henrik Schopmans, Pascal Friederich





Efficient sampling of the Boltzmann distribution of molecular systems is a long-standing challenge. Recently, instead of generating long molecular dynamics simulations, generative machine learning methods such as normalizing flows have been used to learn the Boltzmann distribution directly, without samples. However, this approach is susceptible to mode collapse and thus often does not explore the full configurational space. In this work, we address this challenge by separating the problem into two levels, the fine-grained and coarse-grained degrees of freedom. A normalizing flow conditioned on the coarse-grained space yields a probabilistic connection between the two levels. To explore the configurational space, we employ coarse-grained simulations with active learning which allows us to update the flow and make all-atom potential energy evaluations only when necessary. Using alanine dipeptide as an example, we show that our methods obtain a speedup to molecular dynamics simulations of approximately 15.9 to 216.2 compared to the speedup of 4.5 of the current state-of-the-art machine learning approach.



Transition Path Sampling with Boltzmann Generator-based MCMC Moves


Michael Plainer, Hannes Stärk, Charlotte Bunne, Stephan Günnemann





Sampling all possible transition paths between two 3D states of a molecular system has various applications ranging from catalyst design to drug discovery. Current approaches to sample transition paths use Markov chain Monte Carlo and rely on time-intensive molecular dynamics simulations to find new paths. Our approach operates in the latent space of a normalizing flow that maps from the molecule's Boltzmann distribution to a Gaussian, where we propose new paths without requiring molecular simulations. Using alanine dipeptide, we explore Metropolis-Hastings acceptance criteria in the latent space for exact sampling and investigate different latent proposal mechanisms.



Nonequilbrium physics of generative diffusion models


Zhendong Yu, Haiping Huang





Generative diffusion models apply the concept of Langevin dynamics in physics to machine learning, attracting a lot of interest from industrial application, but a complete picture about inherent mechanisms is still lacking. In this paper, we provide a transparent physics analysis of the diffusion models, deriving the fluctuation theorem, entropy production, Franz-Parisi potential to understand the intrinsic phase transitions discovered recently. Our analysis is rooted in non-equilibrium physics and concepts from equilibrium physics, i.e., treating both forward and backward dynamics as a Langevin dynamics, and treating the reverse diffusion generative process as a statistical inference, where the time-dependent state variables serve as quenched disorder studied in spin glass theory. This unified principle is expected to guide machine learning practitioners to design better algorithms and theoretical physicists to link the machine learning to non-equilibrium thermodynamics.



Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data


Bastian Boll, Daniel Gonzalez-Alvarado, Stefania Petra, Christoph Schnörr





We introduce a novel generative model for the representation of joint probability distributions of a possibly large number of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions, which also enables to sample efficiently from the target distribution and to assess the likelihood of unseen data points. The embedding of the flow via the Segre map in the meta-simplex of all discrete joint distributions ensures that any target distribution can be represented in principle, whose complexity in practice only depends on the parametrization of the affinity function of the dynamical assignment flow system. Our model can be trained in a simulation-free manner without integration by conditional Riemannian flow matching, using the training data encoded as geodesics in closed-form with respect to the e-connection of information geometry. By projecting high-dimensional flow matching in the meta-simplex of joint distributions to the submanifold of factorizing distributions, our approach has strong motivation from first principles of modeling coupled discrete variables. Numerical experiments devoted to distributions of structured image labelings demonstrate the applicability to large-scale problems, which may include discrete distributions in other application areas. Performance measures show that our approach scales better with the increasing number of classes than recent related work.

