Biology-inspired joint distribution neurons based on Hierarchical Correlation Reconstruction allowing for multidirectional neural networks

arXiv:2405.05097

Published 6/21/2024 by Jarek Duda

Abstract

Popular artificial neural networks (ANN) optimize parameters for unidirectional value propagation, assuming some arbitrary parametrization type like Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN). In contrast, for biological neurons it is not uncommon, for example, for axonal propagation of action potentials to happen in both directions \cite{axon}, suggesting they are optimized to continuously operate in a multidirectional way. Additionally, the statistical dependencies a single neuron could model are not just (expected) value dependence, but entire joint distributions, including higher moments. Such a more agnostic joint distribution neuron would allow for multidirectional propagation (of distributions or values), e.g. $\rho(x|y,z)$ or $\rho(y,z|x)$, by substituting into $\rho(x,y,z)$ and normalizing. This paper discusses Hierarchical Correlation Reconstruction (HCR) for such a neuron model: it assumes a $\rho(x,y,z)=\sum_{ijk} a_{ijk} f_i(x) f_j(y) f_k(z)$ type parametrization of the joint distribution in a polynomial basis $f_i$, which allows for flexible, inexpensive processing including nonlinearities, direct model estimation and update, and training through standard backpropagation or novel ways for such a structure, up to tensor decomposition or an information bottleneck approach. Using only pairwise (input-output) dependencies, its expected value prediction becomes KAN-like with trained activation functions as polynomials, and it can be extended by adding higher order dependencies through included products in a conscious, interpretable way, allowing for multidirectional propagation of both values and probability densities.

Overview

Popular artificial neural networks (ANNs) optimize parameters for unidirectional value propagation, assuming a specific parametrization like Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN).
Biological neurons can propagate action potentials bidirectionally, suggesting they are optimized for multidirectional operation.
A single neuron could model statistical dependencies beyond just expected value, including entire joint distributions and higher moments.
The paper discusses Hierarchical Correlation Reconstruction (HCR), a neuron model that parametrizes the joint distribution in a polynomial basis, which makes multidirectional propagation of both values and probability densities flexible and inexpensive.

Plain English Explanation

Artificial neural networks (ANNs) are a type of machine learning model inspired by the human brain. Typically, these models are designed to propagate information in a single direction, from the input to the output. Their parameters are optimized for that one direction under a chosen parametrization, such as a Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN).

However, real biological neurons in the brain can transmit signals in both directions along their axons. This suggests that biological neurons are optimized to operate in a more multidirectional way, rather than just unidirectionally. Additionally, a single neuron in the brain may be able to model more complex statistical dependencies, not just the expected value of the output, but the entire joint distribution of the input and output variables, including higher moments like variance and skewness.

The paper discusses a neuron model based on Hierarchical Correlation Reconstruction (HCR) that aims to capture this multidirectional and more flexible statistical modeling. HCR writes the joint distribution of a neuron's inputs and outputs as a weighted sum of products of polynomial basis functions, which allows efficient processing of both values and probability densities in multiple directions. This could lead to more accurate and robust artificial neural networks that are better aligned with the way biological neurons operate.

Technical Explanation

The paper proposes a neuron model based on Hierarchical Correlation Reconstruction (HCR) that aims to go beyond the unidirectional value propagation assumed by popular artificial neural network (ANN) architectures like Multi-Layer Perceptrons (MLPs) and Kolmogorov-Arnold Networks (KANs).

The key idea is that biological neurons often exhibit bidirectional propagation of action potentials along their axons, suggesting they are optimized for multidirectional operation. Additionally, a single neuron may be able to model not just the expected value dependence between inputs and outputs, but the entire joint probability distribution, including higher moments like variance and skewness.

The HCR neuron model assumes a specific parametrization of the joint distribution, $\rho(x,y,z) = \sum_{ijk} a_{ijk} f_i(x) f_j(y) f_k(z)$, where the $f_i$ form a polynomial basis and the coefficients $a_{ijk}$ are the model parameters. Conditional distributions such as $\rho(x|y,z)$ or $\rho(y,z|x)$ are obtained by substituting the known values into the joint distribution and normalizing the result, which is what makes propagation of both values and probability densities possible in any direction at low cost.
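The substitute-and-normalize step can be sketched in a few lines of Python. This is a minimal illustration under assumptions the summary does not state: all variables are already normalized to $[0,1]$ with roughly uniform marginals, the $f_i$ are shifted Legendre polynomials scaled to be orthonormal on $[0,1]$, and each coefficient $a_{ijk}$ is estimated as a sample mean of $f_i(x) f_j(y) f_k(z)$, which is valid for an orthonormal basis. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np
from numpy.polynomial import legendre


def basis(u, m):
    """Evaluate f_0..f_m at points u in [0, 1]; returns shape (len(u), m + 1).

    Assumption: f_i are shifted Legendre polynomials scaled to be
    orthonormal on [0, 1], so f_0 = 1 and the others integrate to 0.
    """
    u = np.atleast_1d(np.asarray(u, dtype=float))
    # legvander evaluates P_0..P_m on [-1, 1]; sqrt(2i + 1) makes them orthonormal on [0, 1]
    return legendre.legvander(2.0 * u - 1.0, m) * np.sqrt(2.0 * np.arange(m + 1) + 1.0)


def estimate_coefficients(x, y, z, m):
    """a[i, j, k] = sample mean of f_i(x) f_j(y) f_k(z) (valid for an orthonormal basis)."""
    fx, fy, fz = basis(x, m), basis(y, m), basis(z, m)
    return np.einsum("ni,nj,nk->ijk", fx, fy, fz) / fx.shape[0]


def conditional_density_x(a, y, z, grid):
    """rho(x | y, z) on `grid`: substitute y and z into the joint, then normalize."""
    m = a.shape[0] - 1
    fy, fz = basis(y, m)[0], basis(z, m)[0]
    c = np.einsum("ijk,j,k->i", a, fy, fz)  # coefficients of the remaining polynomial in x
    # The integral over x equals c[0], because only f_0 has a nonzero integral.
    return basis(grid, m) @ (c / c[0])


# Toy data (hypothetical): x follows y with noise, z is irrelevant.
rng = np.random.default_rng(0)
y = rng.random(5000)
z = rng.random(5000)
x = (y + 0.1 * rng.standard_normal(5000)) % 1.0  # wrap to stay in [0, 1)

a = estimate_coefficients(x, y, z, m=3)
grid = np.linspace(0.0, 1.0, 201)
density = conditional_density_x(a, y=0.3, z=0.5, grid=grid)
```

The same coefficient tensor `a` could be contracted along a different axis to obtain, for example, $\rho(y,z|x)$, which is the sense in which one set of parameters supports several propagation directions. Note that a polynomial estimate like this can dip below zero, so a practical implementation would likely need some calibration step that this sketch omits.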

The authors show that using only pairwise (input-output) dependencies, the expected value prediction of HCR becomes KAN-like, with polynomials as the trained activation functions. The model can then be extended with higher-order dependencies by including products of basis functions over more variables, in an interpretable way that still allows multidirectional propagation.
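To see why the pairwise case looks KAN-like, assume (beyond what the summary states) an orthonormal shifted Legendre basis on $[0,1]$ and variables normalized to uniform marginals. Since $\int_0^1 x f_1(x)\,dx = 1/\sqrt{12}$ and $\int_0^1 x f_i(x)\,dx = 0$ for $i \ge 2$, keeping only input-output pairs gives $E[x|y,z] \approx \frac{1}{2} + \frac{1}{\sqrt{12}} \left( \sum_{j \ge 1} a_{1j0} f_j(y) + \sum_{k \ge 1} a_{10k} f_k(z) \right)$, that is, a sum of one learned polynomial per input. The sketch below illustrates this on toy data; the helper names are hypothetical and this is not the paper's implementation.

```python
import numpy as np
from numpy.polynomial import legendre


def basis(u, m):
    """Orthonormal shifted Legendre polynomials f_0..f_m on [0, 1] (assumed basis)."""
    u = np.atleast_1d(np.asarray(u, dtype=float))
    return legendre.legvander(2.0 * u - 1.0, m) * np.sqrt(2.0 * np.arange(m + 1) + 1.0)


def fit_pairwise(x, inputs, m):
    """For each input u, estimate a[j] = mean(f_1(x) * f_j(u)) for j = 1..m."""
    f1x = basis(x, 1)[:, 1]
    return [(f1x[:, None] * basis(u, m)[:, 1:]).mean(axis=0) for u in inputs]


def predict_mean(coeffs, inputs, m):
    """E[x | inputs] ~ 1/2 + (1/sqrt(12)) * sum over inputs of sum_j a[j] f_j(u)."""
    # Each entry is one input's polynomial "activation function" evaluated at that input.
    activations = [basis(u, m)[:, 1:] @ a for a, u in zip(coeffs, inputs)]
    return 0.5 + np.sum(activations, axis=0) / np.sqrt(12.0)


# Toy data (hypothetical): x copies y exactly, z is independent noise.
rng = np.random.default_rng(1)
y = rng.random(20000)
z = rng.random(20000)
x = y.copy()

coeffs = fit_pairwise(x, [y, z], m=3)
y_test = np.linspace(0.05, 0.95, 10)
z_test = np.full(10, 0.5)
pred = predict_mean(coeffs, [y_test, z_test], m=3)
```

On this toy data the fitted polynomial for `y` is close to the identity and the one for `z` is close to zero, so `pred` tracks `y_test`. Each input contributes its own univariate polynomial, which is the structural similarity to KAN that the summary refers to.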

Critical Analysis

The paper presents an interesting neuron model that aims to capture more complex statistical dependencies and multidirectional propagation, which could lead to more accurate and robust artificial neural networks. However, there are a few potential caveats and areas for further research:

The paper focuses on the theoretical formulation of the HCR neuron model, but does not provide extensive experimental validation or comparisons to other biologically inspired approaches, such as Hebbian learning rules or task-specific neuron architectures. Empirical evaluations on real-world tasks would help demonstrate the practical benefits of the HCR approach.
The computational complexity and scalability of the HCR model are not thoroughly discussed. As the number of input and output variables increases, the number of parameters in the joint distribution parametrization may grow rapidly, potentially leading to challenges in training and inference.
The paper does not address how the HCR model could be integrated into larger hierarchical neural network architectures or how it might interact with other biologically-inspired neuron models and learning rules.
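The scalability concern can be made concrete with a quick count. The sketch below assumes every variable uses the same polynomial degree $m$, so each index of $a_{ijk\ldots}$ runs over $0..m$; it compares the full coefficient tensor with a model restricted to constant, single-variable, and pairwise terms. The function names are hypothetical.

```python
from math import comb


def full_tensor_params(d, m):
    """Coefficients a_{i1..id} with every one of the d indices running over 0..m."""
    return (m + 1) ** d


def pairwise_params(d, m):
    """Keep only the constant, single-variable, and two-variable terms."""
    return 1 + d * m + comb(d, 2) * m * m


for d in (3, 5, 10):
    print(d, full_tensor_params(d, 3), pairwise_params(d, 3))
```

For 10 variables at degree 3 the full tensor has 1,048,576 coefficients while the pairwise restriction has 436, which is why limiting the included products matters in practice.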

Overall, the HCR neuron model presents an interesting theoretical direction for exploring more flexible and biologically-plausible neuron representations in artificial neural networks. Further empirical validation and integration with other advancements in neural network architecture and learning could help assess the practical significance of this approach.

Conclusion

The paper presents a neuron model based on Hierarchical Correlation Reconstruction (HCR), which aims to go beyond the unidirectional value propagation assumed by popular artificial neural network architectures. Inspired by the bidirectional signal transmission observed in biological neurons, HCR makes multidirectional propagation of both values and probability densities flexible and inexpensive.

By modeling the entire joint distribution of inputs and outputs, rather than just expected value dependencies, HCR could lead to more accurate and robust artificial neural networks that better capture the complex statistical relationships present in real-world data. However, further empirical validation, analysis of computational complexity, and integration with other biologically-inspired neuron models are needed to fully assess the potential impact of this approach.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Two Tales of Single-Phase Contrastive Hebbian Learning

Rasmus Kjær Høier, Christopher Zach

The search for "biologically plausible" learning algorithms has converged on the idea of representing gradients as activity differences. However, most approaches require a high degree of synchronization (distinct phases during learning) and introduce substantial computational overhead, which raises doubts regarding their biological plausibility as well as their potential utility for neuromorphic computing. Furthermore, they commonly rely on applying infinitesimal perturbations (nudges) to output units, which is impractical in noisy environments. Recently it has been shown that by modelling artificial neurons as dyads with two oppositely nudged compartments, it is possible for a fully local learning algorithm named "dual propagation" to bridge the performance gap to backpropagation, without requiring separate learning phases or infinitesimal nudging. However, the algorithm has the drawback that its numerical stability relies on symmetric nudging, which may be restrictive in biological and analog implementations. In this work we first provide a solid foundation for the objective underlying the dual propagation method, which also reveals a surprising connection with adversarial robustness. Second, we demonstrate how dual propagation is related to a particular adjoint state method, which is stable regardless of asymmetric nudging.

6/26/2024

cs.LG cs.NE

Dendrites endow artificial neural networks with accurate, robust and parameter-efficient learning

Spyridon Chavlis, Panayiota Poirazi

Artificial neural networks (ANNs) are at the core of most Deep learning (DL) algorithms that successfully tackle complex problems like image recognition, autonomous driving, and natural language processing. However, unlike biological brains who tackle similar problems in a very efficient manner, DL algorithms require a large number of trainable parameters, making them energy-intensive and prone to overfitting. Here, we show that a new ANN architecture that incorporates the structured connectivity and restricted sampling properties of biological dendrites counteracts these limitations. We find that dendritic ANNs are more robust to overfitting and outperform traditional ANNs on several image classification tasks while using significantly fewer trainable parameters. This is achieved through the adoption of a different learning strategy, whereby most of the nodes respond to several classes, unlike classical ANNs that strive for class-specificity. These findings suggest that the incorporation of dendrites can make learning in ANNs precise, resilient, and parameter-efficient and shed new light on how biological features can impact the learning strategies of ANNs.

4/8/2024

cs.NE cs.LG

Neuron-centric Hebbian Learning

Andrea Ferigo, Elia Cunegatti, Giovanni Iacca

One of the most striking capabilities behind the learning mechanisms of the brain is the adaptation, through structural and functional plasticity, of its synapses. While synapses have the fundamental role of transmitting information across the brain, several studies show that it is the neuron activations that produce changes on synapses. Yet, most plasticity models devised for artificial Neural Networks (NNs), e.g., the ABCD rule, focus on synapses, rather than neurons, therefore optimizing synaptic-specific Hebbian parameters. This approach, however, increases the complexity of the optimization process since each synapse is associated to multiple Hebbian parameters. To overcome this limitation, we propose a novel plasticity model, called Neuron-centric Hebbian Learning (NcHL), where optimization focuses on neuron- rather than synaptic-specific Hebbian parameters. Compared to the ABCD rule, NcHL reduces the parameters from $5W$ to $5N$, where $W$ and $N$ are the numbers of weights and neurons, and usually $N \ll W$. We also devise a "weightless" NcHL model, which requires less memory by approximating the weights based on a record of neuron activations. Our experiments on two robotic locomotion tasks reveal that NcHL performs comparably to the ABCD rule, despite using up to $\sim 97$ times fewer parameters, thus allowing for scalable plasticity.

4/17/2024

cs.NE cs.AI cs.LG

Growing Artificial Neural Networks for Control: the Role of Neuronal Diversity

Eleni Nisioti, Erwan Plantec, Milton Montero, Joachim Winther Pedersen, Sebastian Risi

In biological evolution complex neural structures grow from a handful of cellular ingredients. As genomes in nature are bounded in size, this complexity is achieved by a growth process where cells communicate locally to decide whether to differentiate, proliferate and connect with other cells. This self-organisation is hypothesized to play an important part in the generalisation and robustness of biological neural networks. Artificial neural networks (ANNs), on the other hand, are traditionally optimized in the space of weights. Thus, the benefits and challenges of growing artificial neural networks remain understudied. Building on the previously introduced Neural Developmental Programs (NDP), in this work we present an algorithm for growing ANNs that solve reinforcement learning tasks. We identify a key challenge: ensuring phenotypic complexity requires maintaining neuronal diversity, but this diversity comes at the cost of optimization stability. To address this, we introduce two mechanisms: (a) equipping neurons with an intrinsic state inherited upon neurogenesis; (b) lateral inhibition, a mechanism inspired by biological growth, which controls the pace of growth, helping diversity persist. We show that both mechanisms contribute to neuronal diversity and that, equipped with them, NDPs achieve comparable results to existing direct and developmental encodings in complex locomotion tasks.

5/15/2024

cs.NE cs.AI