Improving equilibrium propagation without weight symmetry through Jacobian homeostasis

Read original: arXiv:2309.02214 - Published 4/9/2024 by Axel Laborieux, Friedemann Zenke

Overview

Equilibrium propagation (EP) is an alternative to backpropagation (BP) for computing gradients in neural networks, particularly for biological or analog neuromorphic substrates.
EP requires weight symmetry and infinitesimal equilibrium perturbations, which can be challenging to implement in physical systems.
This paper investigates the impact of weight asymmetry on the applicability of EP.

Plain English Explanation

Equilibrium propagation (EP) is a way of training neural networks that's different from the standard backpropagation (BP) algorithm. EP is particularly useful for neural networks that are designed to work with biological or analog hardware, rather than digital computers.

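EP is attractive because it extracts gradients from the network's own dynamics, which makes it a natural fit for physical hardware. In its standard form, however, it assumes that the network's weights (the strength of the connections between neurons) are symmetric: the connection from one neuron to another must be as strong as the connection in the opposite direction. This symmetry is often hard to achieve in real-world physical systems.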

EP rests on two assumptions: the weights are symmetric, and the network is "nudged" with very small (ideally infinitesimal) perturbations. The paper investigates what happens when these assumptions are violated, in other words, when the weights are asymmetric and the nudges are finite.

The researchers find that the finite size of the nudges isn't a major problem: for complex-differentiable networks, the exact gradients can still be estimated using a mathematical technique called a Cauchy integral. But the weight asymmetry does introduce biases that can significantly impair the network's ability to solve complex tasks, like classifying images in the ImageNet 32x32 dataset.

To address this issue, the researchers propose a new "homeostatic objective" that directly penalizes any functional asymmetries in the network. This dramatically improves the network's performance on these challenging tasks, bringing it closer to the level achieved by standard backpropagation.

The findings from this paper lay the groundwork for developing learning algorithms that can work effectively with the imperfections of real-world physical neural networks, in particular those whose learning relies on the hardware settling into an equilibrium.

Technical Explanation

The paper presents a study of generalized equilibrium propagation (GEP), which is a formulation of the EP algorithm that can be used with non-symmetric weights. The researchers analytically isolate the two sources of bias in GEP: the finite size of the "nudges" used to estimate gradients, and the weight asymmetry itself.
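The first source of bias, the finite nudge, can be illustrated on a toy model. The sketch below is not the paper's network: it assumes a single neuron with energy E(s) = s²/2 − θxs and loss L(s) = (s − y)²/2, chosen here only because the nudged equilibrium has a closed form. All names (`theta`, `x`, `y`, `beta`, and the helper functions) are hypothetical.

```python
# Toy scalar energy-based model (an illustrative assumption, not the paper's setup).
# Energy: E(s) = 0.5*s**2 - theta*x*s,  loss: L(s) = 0.5*(s - y)**2.

def equilibrium(theta, x, y, beta):
    """State minimizing E + beta*L; beta = 0 gives the free equilibrium."""
    return (theta * x + beta * y) / (1.0 + beta)

def dE_dtheta(s, x):
    """Partial derivative of the energy with respect to theta at state s."""
    return -x * s

def ep_estimate(theta, x, y, beta):
    """One-sided EP estimate: change in dE/dtheta between nudged and free states."""
    s_free = equilibrium(theta, x, y, 0.0)
    s_nudged = equilibrium(theta, x, y, beta)
    return (dE_dtheta(s_nudged, x) - dE_dtheta(s_free, x)) / beta

def true_gradient(theta, x, y):
    """Exact dL/dtheta evaluated at the free equilibrium s* = theta*x."""
    return (theta * x - y) * x

theta, x, y = 0.8, 1.5, 2.0
for beta in (1.0, 0.1, 0.001):
    # The estimate approaches the true gradient only as beta shrinks.
    print(beta, ep_estimate(theta, x, y, beta), true_gradient(theta, x, y))
```

In this toy model the estimate works out to the true gradient scaled by 1/(1 + beta), so the bias disappears only in the limit of an infinitesimal nudge.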

They show that, for complex-differentiable non-symmetric networks, the finite nudge size does not pose a significant problem, as the exact derivatives can still be estimated using a Cauchy integral. However, the weight asymmetry does introduce bias: EP's neuronal "error vectors" (a measure of how much each neuron's activity should change) align poorly with those computed by backpropagation, which results in low task performance.
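The Cauchy-integral idea can be sketched on a toy example. The code below assumes a single neuron with energy E(s) = s²/2 − θxs and loss L(s) = (s − y)²/2, whose nudged equilibrium is s = (θx + βy)/(1 + β). This model, and every name in the code, is an assumption made for illustration rather than the paper's network. Because the equilibrium is complex-differentiable in β, its derivative at β = 0 can be recovered from finite complex nudges placed on a circle.

```python
import cmath

def nudged_dE_dtheta(theta, x, y, beta):
    """dE/dtheta at the nudged equilibrium of the toy model; beta may be complex."""
    s = (theta * x + beta * y) / (1.0 + beta)
    return -x * s

def cauchy_gradient(theta, x, y, radius=0.5, n_points=32):
    """Estimate d/dbeta of nudged_dE_dtheta at beta = 0 from finite complex nudges.

    Discretized Cauchy formula:
        g'(0) ~ (1 / (N * r)) * sum_k g(r * exp(i*phi_k)) * exp(-i*phi_k)
    The radius must stay below 1 here because the toy model has a pole at beta = -1.
    """
    total = 0.0 + 0.0j
    for k in range(n_points):
        phase = cmath.exp(2j * cmath.pi * k / n_points)
        # Dividing by a unit-modulus phase is the same as multiplying by exp(-i*phi_k).
        total += nudged_dE_dtheta(theta, x, y, radius * phase) / phase
    return (total / (n_points * radius)).real

theta, x, y = 0.8, 1.5, 2.0
exact = (theta * x - y) * x  # dL/dtheta at the free equilibrium
print(cauchy_gradient(theta, x, y), exact)
```

Even though every nudge has magnitude 0.5, the averaged estimate agrees with the exact gradient up to a discretization error that shrinks rapidly as `n_points` grows.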

To mitigate this issue, the researchers introduce a new "homeostatic objective" that directly penalizes functional asymmetries of the network's Jacobian (the matrix that describes how the network's dynamics respond to small changes in the neurons' states) at the fixed point. Applying this homeostatic objective dramatically improves the network's performance on complex tasks like ImageNet 32x32 classification.
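To make the link between Jacobian asymmetry and error-vector alignment concrete, here is a minimal sketch. It relies on two assumptions that this summary does not spell out: that the exact (BP-like) error vector solves a linear system with the transposed Jacobian while the response to a small nudge solves the same system with the Jacobian itself, and that asymmetry is measured by the squared Frobenius norm of J − Jᵀ. The paper's actual objective may take a different form, and the 2x2 matrices and function names are hypothetical.

```python
import math

def transpose(a):
    """Transpose a square matrix stored as nested lists."""
    return [list(row) for row in zip(*a)]

def asymmetry_penalty(jac):
    """Squared Frobenius norm of J - J^T; zero exactly when J is symmetric."""
    jt = transpose(jac)
    n = len(jac)
    return sum((jac[i][j] - jt[i][j]) ** 2 for i in range(n) for j in range(n))

def solve_2x2(a, b):
    """Solve a @ v = b for a 2x2 matrix a by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def cosine(u, v):
    """Cosine similarity between two 2-vectors."""
    dot = sum(p * q for p, q in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def error_alignment(jac, loss_grad):
    """Cosine between J^{-1} g (nudge response) and J^{-T} g (adjoint error)."""
    return cosine(solve_2x2(jac, loss_grad), solve_2x2(transpose(jac), loss_grad))

loss_grad = [1.0, 0.0]
symmetric = [[-1.0, 0.4], [0.4, -1.0]]
asymmetric = [[-1.0, 0.9], [-0.3, -1.0]]
for jac in (symmetric, asymmetric):
    # Prints the asymmetry penalty followed by the alignment of the two error vectors.
    print(asymmetry_penalty(jac), error_alignment(jac, loss_grad))
```

For the symmetric matrix the penalty is zero and the two vectors coincide. For the asymmetric one the penalty is positive and the alignment drops well below 1, which is the kind of mismatch a penalty on Jacobian asymmetry is meant to reduce.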

The paper's findings lay the theoretical groundwork for studying and mitigating the adverse effects of imperfections in physical networks on learning algorithms that rely on the substrate's relaxation dynamics.

Critical Analysis

The paper provides a thorough analysis of the impact of weight asymmetry on the applicability of equilibrium propagation. The researchers have done a commendable job of isolating the different sources of bias and analytically characterizing their effects.

One potential limitation of the study is that it focuses on the idealized case of complex-differentiable networks. In practice, real-world physical networks may exhibit more complex, non-differentiable dynamics that could pose additional challenges. The researchers acknowledge this and suggest that further research is needed to understand the implications for a broader class of physical substrates.

Additionally, while the proposed homeostatic objective effectively mitigates the issues caused by weight asymmetry, it remains to be seen how this objective would perform in more realistic, noisy physical systems. The researchers mention that further investigations are needed to understand the interplay between the homeostatic objective and other sources of imperfections, such as finite precision, device variability, and noise.

Overall, this paper makes an important contribution to the field of neuromorphic computing by shedding light on the limitations of equilibrium propagation and proposing a promising solution. The findings could have significant implications for the design and optimization of physical neural networks that can effectively leverage the power of gradient-based learning.

Conclusion

This paper investigates the impact of weight asymmetry on the applicability of equilibrium propagation (EP), an alternative to backpropagation for training neural networks on biological or analog hardware. The researchers find that while the finite size of the "nudges" used in EP does not pose a major problem, weight asymmetry can introduce significant biases that impair the network's performance on complex tasks.

To address this issue, the researchers propose a new "homeostatic objective" that directly penalizes functional asymmetries in the network's Jacobian. This dramatically improves the network's ability to solve challenging problems, such as ImageNet 32x32 classification.

The findings from this paper lay the groundwork for developing learning algorithms that can work effectively with the imperfections of real-world physical neural networks, paving the way for more robust and efficient neuromorphic computing systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Improving equilibrium propagation without weight symmetry through Jacobian homeostasis

Axel Laborieux, Friedemann Zenke

Equilibrium propagation (EP) is a compelling alternative to the backpropagation of error algorithm (BP) for computing gradients of neural networks on biological or analog neuromorphic substrates. Still, the algorithm requires weight symmetry and infinitesimal equilibrium perturbations, i.e., nudges, to estimate unbiased gradients efficiently. Both requirements are challenging to implement in physical systems. Yet, whether and how weight asymmetry affects its applicability is unknown because, in practice, it may be masked by biases introduced through the finite nudge. To address this question, we study generalized EP, which can be formulated without weight symmetry, and analytically isolate the two sources of bias. For complex-differentiable non-symmetric networks, we show that the finite nudge does not pose a problem, as exact derivatives can still be estimated via a Cauchy integral. In contrast, weight asymmetry introduces bias resulting in low task performance due to poor alignment of EP's neuronal error vectors compared to BP. To mitigate this issue, we present a new homeostatic objective that directly penalizes functional asymmetries of the Jacobian at the network's fixed point. This homeostatic objective dramatically improves the network's ability to solve complex tasks such as ImageNet 32x32. Our results lay the theoretical groundwork for studying and mitigating the adverse effects of imperfections of physical networks on learning algorithms that rely on the substrate's relaxation dynamics.

4/9/2024

Quantum Equilibrium Propagation for efficient training of quantum systems based on Onsager reciprocity

Clara C. Wanjura, Florian Marquardt

The widespread adoption of machine learning and artificial intelligence in all branches of science and technology has created a need for energy-efficient, alternative hardware platforms. While such neuromorphic approaches have been proposed and realised for a wide range of platforms, physically extracting the gradients required for training remains challenging as generic approaches only exist in certain cases. Equilibrium propagation (EP) is such a procedure that has been introduced and applied to classical energy-based models which relax to an equilibrium. Here, we show a direct connection between EP and Onsager reciprocity and exploit this to derive a quantum version of EP. This can be used to optimize loss functions that depend on the expectation values of observables of an arbitrary quantum system. Specifically, we illustrate this new concept with supervised and unsupervised learning examples in which the input or the solvable task is of quantum mechanical nature, e.g., the recognition of quantum many-body ground states, quantum phase exploration, sensing and phase boundary exploration. We propose that in the future quantum EP may be used to solve tasks such as quantum phase discovery with a quantum simulator even for Hamiltonians which are numerically hard to simulate or even partially unknown. Our scheme is relevant for a variety of quantum simulation platforms such as ion chains, superconducting qubit arrays, neutral atom Rydberg tweezer arrays and strongly interacting atoms in optical lattices.

6/11/2024

Quantum Equilibrium Propagation: Gradient-Descent Training of Quantum Systems

Benjamin Scellier

Equilibrium propagation (EP) is a training framework for energy-based systems, i.e. systems whose physics minimizes an energy function. EP has been explored in various classical physical systems such as resistor networks, elastic networks, the classical Ising model and coupled phase oscillators. A key advantage of EP is that it achieves gradient descent on a cost function using the physics of the system to extract the weight gradients, making it a candidate for the development of energy-efficient processors for machine learning. We extend EP to quantum systems, where the energy function that is minimized is the mean energy functional (expectation value of the Hamiltonian), whose minimum is the ground state of the Hamiltonian. As examples, we study the settings of the transverse-field Ising model and the quantum harmonic oscillator network -- quantum analogues of the Ising model and elastic network.

6/4/2024

Scaling SNNs Trained Using Equilibrium Propagation to Convolutional Architectures

Jiaqi Lin, Malyaban Bal, Abhronil Sengupta

Equilibrium Propagation (EP) is a biologically plausible local learning algorithm initially developed for convergent recurrent neural networks (RNNs), where weight updates rely solely on the connecting neuron states across two phases. The gradient calculations in EP have been shown to approximate the gradients computed by Backpropagation Through Time (BPTT) when an infinitesimally small nudge factor is used. This property makes EP a powerful candidate for training Spiking Neural Networks (SNNs), which are commonly trained by BPTT. However, in the spiking domain, previous studies on EP have been limited to architectures involving few linear layers. In this work, for the first time we provide a formulation for training convolutional spiking convergent RNNs using EP, bridging the gap between spiking and non-spiking convergent RNNs. We demonstrate that for spiking convergent RNNs, there is a mismatch in the maximum pooling and its inverse operation, leading to inaccurate gradient estimation in EP. Substituting this with average pooling resolves this issue and enables accurate gradient estimation for spiking convergent RNNs. We also highlight the memory efficiency of EP compared to BPTT. In the regime of SNNs trained by EP, our experimental results indicate state-of-the-art performance on the MNIST and FashionMNIST datasets, with test errors of 0.97% and 8.89%, respectively. These results are comparable to those of convergent RNNs and SNNs trained by BPTT. These findings underscore EP as an optimal choice for on-chip training and a biologically-plausible method for computing error gradients.

7/4/2024