From pixels to planning: scale-free active inference

Read original: arXiv:2407.20292 - Published 7/31/2024 by Karl Friston, Conor Heins, Tim Verbelen, Lancelot Da Costa, Tommaso Salvatori, Dimitrije Markovic, Alexander Tschantz, Magnus Koudahl, Christopher Buckley, Thomas Parr

165

🤯

Overview

This paper introduces a new generative model called the Renormalizing Generative Model (RGM)
RGMs are a type of discrete state-space model that can be used for active inference and learning in dynamic settings
The paper demonstrates how RGMs can be applied to various tasks like image classification, movie/music generation, and Atari game learning

Plain English Explanation

The paper describes a new type of machine learning model called a Renormalizing Generative Model (RGM). RGMs are a kind of discrete state-space model that can be used to learn and generate complex data like images, videos, and game interactions.

The key idea behind RGMs is that they can capture the underlying "paths" or "orbits" in the data, rather than just the individual data points. This allows the model to learn the structure and composition of the data at different scales of space and time. For example, an RGM trained on videos could learn to generate new videos by understanding the common "paths" or sequences of events that occur.

RGMs are similar to deep neural networks, but they have a different mathematical structure that makes them well-suited for tasks that involve sequential, temporal, or spatial reasoning. The authors demonstrate how RGMs can be used for a variety of applications, from image classification to music generation to playing Atari games.

The main advantage of RGMs is that they can uncover the underlying structure and compositionality in complex, dynamic data. This could lead to more robust and generalizable AI systems that can better understand and interact with the world.

Technical Explanation

The core of the paper is the introduction of Renormalizing Generative Models (RGMs), which are a type of discrete state-space model. State-space models are a class of probabilistic models that can capture the dynamics of a system over time.

RGMs generalize Partially Observed Markov Decision Processes (POMDPs) by introducing "paths" as latent variables. This allows the model to learn the underlying structure and composition of the data, rather than just modeling individual data points. The authors show how RGMs can be implemented using deep or hierarchical forms and the renormalization group - a mathematical technique for analyzing systems at different scales.

The authors demonstrate several applications of RGMs, including image classification, movie/music generation, and Atari game learning. In each case, the RGM is able to discover the compositional and temporal structure of the data and use that to generate new examples.

Overall, the key technical contribution of this paper is the introduction of RGMs as a new class of generative models that can capture the underlying structure of complex, dynamic data. This could lead to more powerful and versatile AI systems in the future.

Critical Analysis

The paper presents a novel and promising approach to generative modeling, but there are a few potential limitations and areas for further research:

The paper focuses primarily on demonstrating the capabilities of RGMs on a few selected tasks. More extensive evaluation on a wider range of datasets and applications would be needed to fully assess the generalizability and scalability of the approach.
The authors mention that RGMs can be computationally expensive, especially for large-scale or high-dimensional data. Techniques for improving the efficiency and scalability of RGMs would be an important area for future work.
The paper does not provide a detailed comparison of RGMs to other state-of-the-art generative models, such as Variational Autoencoders or Generative Adversarial Networks. Understanding how RGMs perform relative to these other approaches would help contextualize the strengths and limitations of the method.
The applications demonstrated in the paper, while interesting, are relatively narrow in scope. Exploring the use of RGMs for more real-world, complex tasks would help better assess their practical utility and potential impact.

Overall, the Renormalizing Generative Model approach presented in this paper is an intriguing and potentially powerful new direction in generative modeling. With further research and development, RGMs could become an important tool for building more robust and versatile AI systems.

Conclusion

This paper introduces a new type of generative model called the Renormalizing Generative Model (RGM), which is a discrete state-space model that can capture the underlying structure and composition of complex, dynamic data. The authors demonstrate how RGMs can be applied to a variety of tasks, including image classification, movie/music generation, and Atari game learning.

The key innovation of RGMs is their ability to learn the "paths" or "orbits" in the data, rather than just modeling individual data points. This allows the models to uncover the compositional and temporal structure of the data, which could lead to more robust and generalizable AI systems.

While the paper presents promising results, there are a few areas for potential improvement and further research, such as improving the computational efficiency of RGMs and exploring their applicability to a wider range of real-world tasks. Overall, the Renormalizing Generative Model approach is an intriguing new direction in generative modeling that could have significant implications for the future of AI.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

🤯

165

From pixels to planning: scale-free active inference

Karl Friston, Conor Heins, Tim Verbelen, Lancelot Da Costa, Tommaso Salvatori, Dimitrije Markovic, Alexander Tschantz, Magnus Koudahl, Christopher Buckley, Thomas Parr

This paper describes a discrete state-space model -- and accompanying methods -- for generative modelling. This model generalises partially observed Markov decision processes to include paths as latent variables, rendering it suitable for active inference and learning in a dynamic setting. Specifically, we consider deep or hierarchical forms using the renormalisation group. The ensuing renormalising generative models (RGM) can be regarded as discrete homologues of deep convolutional neural networks or continuous state-space models in generalised coordinates of motion. By construction, these scale-invariant models can be used to learn compositionality over space and time, furnishing models of paths or orbits; i.e., events of increasing temporal depth and itinerancy. This technical note illustrates the automatic discovery, learning and deployment of RGMs using a series of applications. We start with image classification and then consider the compression and generation of movies and music. Finally, we apply the same variational principles to the learning of Atari-like games.

7/31/2024

Learning in Hybrid Active Inference Models

Poppy Collis, Ryan Singh, Paul F Kinghorn, Christopher L Buckley

An open problem in artificial intelligence is how systems can flexibly learn discrete abstractions that are useful for solving inherently continuous problems. Previous work in computational neuroscience has considered this functional integration of discrete and continuous variables during decision-making under the formalism of active inference (Parr, Friston & de Vries, 2017; Parr & Friston, 2018). However, their focus is on the expressive physical implementation of categorical decisions and the hierarchical mixed generative model is assumed to be known. As a consequence, it is unclear how this framework might be extended to learning. We therefore present a novel hierarchical hybrid active inference agent in which a high-level discrete active inference planner sits above a low-level continuous active inference controller. We make use of recent work in recurrent switching linear dynamical systems (rSLDS) which implement end-to-end learning of meaningful discrete representations via the piecewise linear decomposition of complex continuous dynamics (Linderman et al., 2016). The representations learned by the rSLDS inform the structure of the hybrid decision-making agent and allow us to (1) specify temporally-abstracted sub-goals in a method reminiscent of the options framework, (2) lift the exploration into discrete space allowing us to exploit information-theoretic exploration bonuses and (3) `cache' the approximate solutions to low-level problems in the discrete planner. We apply our model to the sparse Continuous Mountain Car task, demonstrating fast system identification via enhanced exploration and successful planning through the delineation of abstract sub-goals.

9/4/2024

eXponential FAmily Dynamical Systems (XFADS): Large-scale nonlinear Gaussian state-space modeling

Matthew Dowling, Yuan Zhao, Il Memming Park

State-space graphical models and the variational autoencoder framework provide a principled apparatus for learning dynamical systems from data. State-of-the-art probabilistic approaches are often able to scale to large problems at the cost of flexibility of the variational posterior or expressivity of the dynamics model. However, those consolidations can be detrimental if the ultimate goal is to learn a generative model capable of explaining the spatiotemporal structure of the data and making accurate forecasts. We introduce a low-rank structured variational autoencoding framework for nonlinear Gaussian state-space graphical models capable of capturing dense covariance structures that are important for learning dynamical systems with predictive capabilities. Our inference algorithm exploits the covariance structures that arise naturally from sample based approximate Gaussian message passing and low-rank amortized posterior updates -- effectively performing approximate variational smoothing with time complexity scaling linearly in the state dimensionality. In comparisons with other deep state-space model architectures our approach consistently demonstrates the ability to learn a more predictive generative model. Furthermore, when applied to neural physiological recordings, our approach is able to learn a dynamical system capable of forecasting population spiking and behavioral correlates from a small portion of single trials.

6/3/2024

Bayesian Learning in a Nonlinear Multiscale State-Space Model

Nayely V'elez-Cruz, Manfred D. Laubichler

The ubiquity of multiscale interactions in complex systems is well-recognized, with development and heredity serving as a prime example of how processes at different temporal scales influence one another. This work introduces a novel multiscale state-space model to explore the dynamic interplay between systems interacting across different time scales, with feedback between each scale. We propose a Bayesian learning framework to estimate unknown states by learning the unknown process noise covariances within this multiscale model. We develop a Particle Gibbs with Ancestor Sampling (PGAS) algorithm for inference and demonstrate through simulations the efficacy of our approach.

8/28/2024