Breaching the Bottleneck: Evolutionary Transition from Reward-Driven Learning to Reward-Agnostic Domain-Adapted Learning in Neuromodulated Neural Nets

Read original: arXiv:2404.12631 - Published 8/6/2024 by Solvi Arnold, Reiji Suzuki, Takaya Arita, Kimitoshi Yamazaki

Overview

Advanced biological intelligence can learn efficiently from a rich stream of information, even without explicit feedback on the quality of its behavior.
This type of learning, called Domain-Adapted Learning (DAL), exploits implicit assumptions about the task domain.
In contrast, AI learning algorithms rely on explicit external measures of behavior quality, which creates an information bottleneck that limits their learning efficiency.
The paper explores how biological evolution might have circumvented this bottleneck to produce DAL.

Plain English Explanation

Biological organisms, such as humans and other animals, can learn efficiently from the wealth of information they encounter in their environments, even when they receive no clear feedback on the quality of their actions. This type of learning, called Domain-Adapted Learning (DAL), exploits implicit assumptions about the task domain to learn quickly and effectively.

In comparison, current AI learning algorithms rely heavily on explicit external measures of behavior quality, such as rewards or error signals, to guide their learning. This creates an information bottleneck that limits their ability to learn from the diverse, non-reward-based information available in the real world. As a result, AI systems often struggle to match the learning efficiency of biological intelligence.

The paper explores a potential evolutionary pathway that could have allowed biological systems to overcome this bottleneck and develop the capacity for DAL. The key idea is that species may have first evolved the ability to learn from reward signals, providing a broad but inefficient form of adaptivity. Over time, the integration of non-reward information into the learning process could have been gradually refined, leading to the emergence of more advanced, bottleneck-free, domain-adapted learning.

Technical Explanation

The paper proposes a two-phase evolutionary scenario to explain the emergence of Domain-Adapted Learning (DAL) in biological intelligence. In the first phase, species evolve the ability to learn from reward signals, which provides a broad but inefficient form of adaptivity. In the second phase, evolution integrates non-reward information into the learning process, gradually improving learning efficiency.
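The gradual improvement described for the second phase can be pictured as an ordinary evolutionary loop in which fitness measures how well an agent learns. The following is a minimal sketch under stated assumptions, not the authors' procedure: the function name `evolve`, the truncation-selection scheme, the Gaussian mutation, and the toy fitness function are all hypothetical stand-ins chosen for illustration.

```python
import numpy as np

def evolve(fitness_fn, genome_size, pop_size=20, n_elite=5,
           sigma=0.1, generations=30, seed=0):
    """Truncation selection with Gaussian mutation; returns (best_genome, best_score)."""
    rng = np.random.default_rng(seed)
    population = rng.normal(0.0, 1.0, size=(pop_size, genome_size))
    for _ in range(generations):
        scores = np.array([fitness_fn(g) for g in population])
        # Keep the highest-scoring genomes unchanged (elitism).
        elite = population[np.argsort(scores)[-n_elite:]]
        # Refill the population with mutated copies of randomly chosen elites.
        parents = elite[rng.integers(0, n_elite, size=pop_size - n_elite)]
        children = parents + rng.normal(0.0, sigma, size=parents.shape)
        population = np.vstack([elite, children])
    scores = np.array([fitness_fn(g) for g in population])
    return population[np.argmax(scores)], scores.max()

# Toy stand-in for "learning efficiency": closeness to a hypothetical target genome.
target = np.array([1.0, -2.0, 0.5])

def toy_fitness(genome):
    return -np.sum((genome - target) ** 2)

best_genome, best_score = evolve(toy_fitness, genome_size=3)
```

In a setting like the one the paper describes, the genome would presumably encode parameters of the learning mechanism, and the fitness function would run an agent's lifetime of learning and score the outcome. Here the fitness is a simple quadratic so the loop stays self-contained.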

To investigate the second phase, the researchers set up a population of neural networks (NNs) with reward-driven learning modeled as reinforcement learning, specifically Advantage Actor-Critic (A2C). They then allowed evolution to improve learning efficiency by incorporating non-reward information into the learning process through a neuromodulatory update mechanism.
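To make the reward bottleneck concrete, here is a minimal sketch of the learning signals in a textbook one-step A2C (Advantage Actor-Critic) update. It is an illustration, not the paper's configuration: the function name `a2c_step_signals`, the discount value, and the example numbers are assumptions. The point to notice is that the environment contributes only the scalar `reward` to the update.

```python
import math

def a2c_step_signals(reward, value_s, value_next, log_prob_action,
                     gamma=0.99, done=False):
    """Return (advantage, actor_loss, critic_loss) for one transition."""
    # One-step bootstrapped target: r + gamma * V(s'), or just r at episode end.
    target = reward + (0.0 if done else gamma * value_next)
    advantage = target - value_s
    # Actor: raise the log-probability of actions with positive advantage.
    actor_loss = -log_prob_action * advantage
    # Critic: squared error between the value estimate and the target.
    critic_loss = 0.5 * advantage ** 2
    return advantage, actor_loss, critic_loss

# Hypothetical terminal transition: reward 1.0, value estimate 0.5,
# and the chosen action had probability 0.5 under the policy.
adv, actor_loss, critic_loss = a2c_step_signals(
    reward=1.0, value_s=0.5, value_next=0.0,
    log_prob_action=math.log(0.5), done=True,
)
```

In a full implementation the advantage is treated as a constant when differentiating the actor loss, and both losses are minimized by gradient descent on the network parameters.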

In a continuous 2D navigation task, the evolved DAL agents showed a 300-fold increase in learning speed compared to pure RL agents. Interestingly, the evolution process was found to eliminate reliance on reward information altogether, allowing the DAL agents to learn from non-reward information exclusively, using only local neuromodulation-based connection weight updates.
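A local, neuromodulation-based weight update of the general kind mentioned above can be sketched as follows. This is a generic modulated Hebbian-style rule written under stated assumptions, not the update rule from the paper: the function `neuromodulated_update`, the modulatory weight matrix `M`, the `tanh` activations, and all sizes are hypothetical. What the sketch shows is that each weight change depends only on presynaptic activity, postsynaptic activity, and a per-neuron modulatory signal derived from the stimulus, with no reward term.

```python
import numpy as np

def neuromodulated_update(W, pre, post, mod, lr=0.01):
    """Return W plus a local update: dW[j, i] = lr * mod[j] * post[j] * pre[i]."""
    return W + lr * np.outer(mod * post, pre)

rng = np.random.default_rng(0)
n_in, n_out = 4, 3
W = rng.normal(0.0, 0.1, size=(n_out, n_in))  # plastic connection weights
M = rng.normal(0.0, 0.1, size=(n_out, n_in))  # hypothetical modulatory weights (e.g. set by evolution)

stimulus = rng.normal(size=n_in)   # non-reward sensory input
post = np.tanh(W @ stimulus)       # postsynaptic activity
mod = np.tanh(M @ stimulus)        # per-neuron modulation computed from the stimulus alone
W_new = neuromodulated_update(W, stimulus, post, mod)
```

Because the modulatory signal gates plasticity per neuron, a zero signal leaves the incoming weights of that neuron unchanged, which is one way such a mechanism can restrict learning to the situations where it is useful.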

This research outlines a biologically plausible pathway for the development of Domain-Adapted Learning, highlighting the potential of neuroevolutionary approaches and local neuromodulation-based weight updates to overcome the information bottleneck that limits the learning efficiency of current AI algorithms.

Critical Analysis

The paper presents a compelling evolutionary scenario for the emergence of Domain-Adapted Learning (DAL) in biological intelligence. However, the authors acknowledge that this is a hypothetical pathway, and further research would be needed to validate the proposed mechanism.

One potential limitation of the study is that it focuses solely on a navigation task in continuous 2D space. While this provides a useful testbed, it may not fully capture the complexity and diversity of real-world learning tasks that biological organisms face. Further research could explore the performance of the evolved DAL agents on a wider range of tasks, including those with more complex structure and dynamics.

Additionally, the paper does not delve into the specific mechanisms underlying the neuromodulatory update process that allows the DAL agents to learn from non-reward information. A more detailed exploration of the neural dynamics and information processing involved could provide valuable insights into the biological plausibility and potential applications of this approach.

Despite these potential areas for further investigation, the paper's central idea of overcoming the information bottleneck through a gradual evolutionary process is thought-provoking and could have significant implications for the development of more efficient and flexible AI learning algorithms. Readers are encouraged to critically examine the research and consider how the insights presented could inform the ongoing quest to bridge the gap between artificial and biological intelligence.

Conclusion

This paper explores a biologically plausible pathway for the emergence of Domain-Adapted Learning (DAL) in biological intelligence. It proposes that species first evolve the ability to learn from reward signals, providing broad but inefficient adaptivity, and then gradually integrate non-reward information into the learning process to overcome the information bottleneck that limits current AI learning algorithms.

The researchers' experiments on a continuous 2D navigation task demonstrate the potential of this approach, with evolved DAL agents showing a remarkable 300-fold increase in learning speed compared to pure Reinforcement Learning (RL) agents. Notably, the evolution process was able to eliminate reliance on reward information altogether, allowing the DAL agents to learn exclusively from non-reward information using local neuromodulation-based connection weight updates.

This research highlights the potential of neuroevolutionary approaches and local neuromodulation-based weight updates to overcome the limitations of reward-driven AI learning algorithms. By taking inspiration from the efficient and flexible learning strategies observed in biological systems, this work opens up new avenues for developing more robust and adaptable artificial intelligence that can better navigate the complex, information-rich environments of the real world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Breaching the Bottleneck: Evolutionary Transition from Reward-Driven Learning to Reward-Agnostic Domain-Adapted Learning in Neuromodulated Neural Nets

Solvi Arnold, Reiji Suzuki, Takaya Arita, Kimitoshi Yamazaki

Advanced biological intelligence learns efficiently from an information-rich stream of stimulus information, even when feedback on behaviour quality is sparse or absent. Such learning exploits implicit assumptions about task domains. We refer to such learning as Domain-Adapted Learning (DAL). In contrast, AI learning algorithms rely on explicit externally provided measures of behaviour quality to acquire fit behaviour. This imposes an information bottleneck that precludes learning from diverse non-reward stimulus information, limiting learning efficiency. We consider the question of how biological evolution circumvents this bottleneck to produce DAL. We propose that species first evolve the ability to learn from reward signals, providing inefficient (bottlenecked) but broad adaptivity. From there, integration of non-reward information into the learning process can proceed via gradual accumulation of biases induced by such information on specific task domains. This scenario provides a biologically plausible pathway towards bottleneck-free, domain-adapted learning. Focusing on the second phase of this scenario, we set up a population of NNs with reward-driven learning modelled as Reinforcement Learning (A2C), and allow evolution to improve learning efficiency by integrating non-reward information into the learning process using a neuromodulatory update mechanism. On a navigation task in continuous 2D space, evolved DAL agents show a 300-fold increase in learning speed compared to pure RL agents. Evolution is found to eliminate reliance on reward information altogether, allowing DAL agents to learn from non-reward information exclusively, using local neuromodulation-based connection weight updates only. Code available at github.com/aislab/dal.

8/6/2024

Lifelong Reinforcement Learning via Neuromodulation

Sebastian Lee, Samuel Liebana Garcia, Claudia Clopath, Will Dabney

Navigating multiple tasks (for instance in succession as in continual or lifelong learning, or in distributions as in meta or multi-task learning) requires some notion of adaptation. Evolution over timescales of millennia has imbued humans and other animals with highly effective adaptive learning and decision-making strategies. Central to these functions are so-called neuromodulatory systems. In this work we introduce an abstract framework for integrating theories and evidence from neuroscience and the cognitive sciences into the design of adaptive artificial reinforcement learning algorithms. We give a concrete instance of this framework built on literature surrounding the neuromodulators Acetylcholine (ACh) and Noradrenaline (NA), and empirically validate the effectiveness of the resulting adaptive algorithm in a non-stationary multi-armed bandit problem. We conclude with a theory-based experiment proposal providing an avenue to link our framework back to efforts in experimental neuroscience.

8/19/2024

Enhancing learning in artificial neural networks through cellular heterogeneity and neuromodulatory signaling

Alejandro Rodriguez-Garcia, Jie Mei, Srikanth Ramaswamy

Recent progress in artificial intelligence (AI) has been driven by insights from neuroscience, particularly with the development of artificial neural networks (ANNs). This has significantly enhanced the replication of complex cognitive tasks such as vision and natural language processing. Despite these advances, ANNs struggle with continual learning, adaptable knowledge transfer, robustness, and resource efficiency - capabilities that biological systems handle seamlessly. Specifically, ANNs often overlook the functional and morphological diversity of the brain, hindering their computational capabilities. Furthermore, incorporating cell-type specific neuromodulatory effects into ANNs with neuronal heterogeneity could enable learning at two spatial scales: spiking behavior at the neuronal level, and synaptic plasticity at the circuit level, thereby potentially enhancing their learning abilities. In this article, we summarize recent bio-inspired models, learning rules and architectures and propose a biologically-informed framework for enhancing ANNs. Our proposed dual-framework approach highlights the potential of spiking neural networks (SNNs) for emulating diverse spiking behaviors and dendritic compartments to simulate morphological and functional diversity of neuronal computations. Finally, we outline how the proposed approach integrates brain-inspired compartmental models and task-driven SNNs, balances bioinspiration and complexity, and provides scalable solutions for pressing AI challenges, such as continual learning, adaptability, robustness, and resource-efficiency.

9/17/2024

Biological Neurons Compete with Deep Reinforcement Learning in Sample Efficiency in a Simulated Gameworld

Moein Khajehnejad, Forough Habibollahi, Aswin Paul, Adeel Razi, Brett J. Kagan

How do biological systems and machine learning algorithms compare in the number of samples required to show significant improvements in completing a task? We compared the learning efficiency of in vitro biological neural networks to the state-of-the-art deep reinforcement learning (RL) algorithms in a simplified simulation of the game "Pong". Using DishBrain, a system that embodies in vitro neural networks with in silico computation using a high-density multi-electrode array, we contrasted the learning rate and the performance of these biological systems against time-matched learning from three state-of-the-art deep RL algorithms (i.e., DQN, A2C, and PPO) in the same game environment. This allowed a meaningful comparison between biological neural systems and deep RL. We find that when samples are limited to a real-world time course, even these very simple biological cultures outperformed deep RL algorithms across various game performance characteristics, implying a higher sample efficiency. Ultimately, even when tested across multiple types of information input to assess the impact of higher dimensional data input, biological neurons showcased faster learning than all deep reinforcement learning agents.

5/28/2024