Meat Meets Machine! Multiscale Competency Enables Causal Learning

arXiv:2405.02325

Published 5/7/2024 by Michael Timothy Bennett

Abstract

Biological intelligence uses a multiscale competency architecture (MCA). It exhibits adaptive, goal directed behaviour at all scales, from cells to organs to organisms. In contrast, machine intelligence is only adaptive and goal directed at a high level. Learned policies are passively interpreted using abstractions (e.g. arithmetic) embodied in static interpreters (e.g. x86). Biological intelligence excels at causal learning. Machine intelligence does not. Previous work showed causal learning follows from weak policy optimisation, which is hindered by presupposed abstractions in silico. Here we formalise MCAs as nested agentic abstraction layers, to understand how they might learn causes. We show that weak policy optimisation at low levels enables weak policy optimisation at high. This facilitates what we call multiscale causal learning and high level goal directed behaviour. We argue that by engineering human abstractions in silico we disconnect high level goal directed behaviour from the low level goal directed behaviour that gave rise to it. This inhibits causal learning, and we speculate this is one reason why human recall would be accompanied by feeling, and in silico recall not.

Overview

Biological intelligence exhibits adaptive, goal-directed behavior at all scales, from cells to organs to organisms, while machine intelligence is only adaptive and goal-directed at a high level.
Biological intelligence excels at causal learning, which is hindered in machine intelligence due to the use of pre-defined abstractions (e.g., arithmetic) embodied in static interpreters (e.g., x86).
The paper formalizes biological intelligence as a Multiscale Competency Architecture (MCA), where nested agentic abstraction layers enable multiscale causal learning and high-level goal-directed behavior.

Plain English Explanation

Biological systems, like our own brains and bodies, are incredibly complex and adaptive. They can learn and respond to their environment in sophisticated ways, adjusting their behavior at all scales, from the individual cells to the entire organism. In contrast, current machine intelligence is only truly adaptive and goal-directed at a high level, relying on pre-defined abstractions and static interpreters to passively process information.

The key difference is that biological intelligence is exceptionally good at causal learning - understanding the underlying mechanisms and causes behind the things it observes. This allows it to flexibly adapt and respond to new situations. Machine intelligence, on the other hand, struggles with causal learning because it is constrained by the abstractions and frameworks we've imposed on it, like mathematical operations and programming languages.

The researchers propose that biological intelligence can be understood as a Multiscale Competency Architecture (MCA): nested layers, each of which behaves adaptively and pursues goals. At the lowest levels, weak policy optimization (roughly, preferring the least restrictive policy that still accomplishes the task over a narrowly specialized one) enables higher-level competencies and causal understanding to emerge. This multiscale causal learning is what gives rise to the flexible, intelligent behaviors we see in living things.

In contrast, by encoding human-designed abstractions directly into machine intelligence (like the arithmetic and logic of computer processors), we may be inadvertently disconnecting the high-level goal-directed behavior from the low-level adaptive mechanisms that gave rise to it in biological systems. This could be one reason why human recall is accompanied by feeling while machine recall is not.

Technical Explanation

The paper proposes that biological intelligence uses a Multiscale Competency Architecture (MCA), where adaptive, goal-directed behavior emerges at all scales, from cells to organs to organisms. This is in contrast to machine intelligence, which is only adaptive and goal-directed at a high level.

The authors argue that biological intelligence excels at causal learning, while machine intelligence struggles with it because learned policies are interpreted through predefined abstractions (e.g., arithmetic) embodied in static interpreters (e.g., x86 processors). Previous work has shown that causal learning follows from weak policy optimization, which is hindered by these presupposed abstractions.
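To make "weak policy optimization" more concrete, here is a minimal Python sketch. It is a toy illustration, not the paper's formalism: it assumes a policy can be represented by its extension (the set of situation-action pairs it permits), that a policy is correct when it permits exactly the right actions in the situations the task covers, and that "weaker" means a larger extension. The universe, the task, and all names are hypothetical.

```python
from itertools import product

# Hypothetical toy universe: every (situation, action) pair the agent can express.
SITUATIONS = range(6)
ACTIONS = range(3)
UNIVERSE = frozenset(product(SITUATIONS, ACTIONS))


def extension(rule):
    """All (situation, action) pairs that a rule permits."""
    return frozenset(pair for pair in UNIVERSE if rule(*pair))


def is_correct(ext, task):
    """A policy is correct if, in every situation the task covers,
    it permits exactly the actions the task marks as correct."""
    return all(
        {a for s, a in ext if s == situation} == wanted
        for situation, wanted in task.items()
    )


def weakest_correct_policy(candidates, task):
    """Return the name of the correct candidate with the largest extension."""
    correct = {
        name: extension(rule)
        for name, rule in candidates.items()
        if is_correct(extension(rule), task)
    }
    return max(correct, key=lambda name: len(correct[name]))


# Observed task: in situations 0, 1 and 2 the only correct action equals the situation.
task = {0: {0}, 1: {1}, 2: {2}}

candidates = {
    "memorise": lambda s, a: s < 3 and a == s,  # permits only the 3 observed pairs
    "modular": lambda s, a: a == s % 3,         # permits 6 pairs, one per situation
    "anything": lambda s, a: True,              # permits all 18 pairs, so it is incorrect
}

print(weakest_correct_policy(candidates, task))  # modular
```

In this toy setting "memorise" and "modular" both fit the observations, but "modular" rules out less of the universe, so it is the one selected. "anything" is weaker still, but it permits wrong actions in the observed situations and is rejected.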

To understand how biological MCAs might enable causal learning, the researchers formalize them as nested agentic abstraction layers. They show that weak policy optimization at low levels enables weak policy optimization at higher levels, facilitating what they call multiscale causal learning and high-level goal-directed behavior.
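One way to picture the claim that weakness at a low level enables weakness at a higher level is the toy Python sketch below. It rests on an assumption that is not taken from the paper: the upper layer can only express statements built from actions the lower layer's policy permits. The task, the candidate rules, and all names are hypothetical.

```python
from itertools import product

SITUATIONS = range(4)


def upper_universe(permitted_actions):
    """Statements the upper layer can express: (situation, action) pairs
    built only from actions the lower layer permits (an assumption)."""
    return frozenset(product(SITUATIONS, permitted_actions))


def best_weakness(universe, rules, task):
    """Largest extension size among rules that are correct for the task;
    0 means no candidate rule is correct in this universe."""
    best = 0
    for rule in rules:
        ext = {pair for pair in universe if rule(*pair)}
        correct = all(
            {a for s, a in ext if s == situation} == wanted
            for situation, wanted in task.items()
        )
        if correct:
            best = max(best, len(ext))
    return best


# Upper-level task: go "left" in situation 0 and "right" in situation 1.
task = {0: {"left"}, 1: {"right"}}

rules = [
    # Memorise the two observed pairs and permit nothing else.
    lambda s, a: (s, a) in {(0, "left"), (1, "right")},
    # Constrain only the observed situations; leave the rest open.
    lambda s, a: a == ("left", "right")[s] if s < 2 else True,
]

strong_lower = {"left"}                  # lower layer permits a single action
medium_lower = {"left", "right"}
weak_lower = {"left", "right", "stay"}   # lower layer constrains the least

for lower in (strong_lower, medium_lower, weak_lower):
    print(sorted(lower), best_weakness(upper_universe(lower), rules, task))
```

Under these assumptions, the most restrictive lower layer leaves the upper layer unable to express any correct policy (weakness 0), while each loosening of the lower layer lets the best correct upper-level policy grow weaker (6, then 8).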

The paper argues that by engineering human abstractions directly into machine intelligence (like the computational dualism of hardware and software), we may be disconnecting high-level goal-directed behavior from the low-level adaptive mechanisms that gave rise to it in biological systems. This, the authors speculate, could be one reason why human recall is accompanied by feeling and meaning, while machine recall is not.

Critical Analysis

The paper presents a thought-provoking perspective on the differences between biological and machine intelligence, and how the engineering of human abstractions into machine systems may be inhibiting the development of causal learning capabilities.

One potential limitation is that the formalization of MCAs and the mechanisms of multiscale causal learning are not fully developed or empirically validated in the paper. The authors acknowledge this and suggest that further research is needed to test their hypotheses and explore the specific architectural and learning principles that enable biological intelligence to excel at causal reasoning.

Additionally, the paper does not address the potential challenges of engineering machine intelligence systems that can truly emulate the flexibility and adaptability of biological cognition. Replicating the multi-scale, nested competencies of living systems in artificial constructs may require fundamental breakthroughs in our approaches to machine learning and computational architectures.

Overall, the paper raises important questions about the nature of intelligence and the limitations of current machine learning techniques. By highlighting the significance of causal learning and the potential downsides of over-engineering human abstractions into artificial systems, it encourages readers to think critically about the future of artificial intelligence and its relationship to biological cognition.

Conclusion

This paper proposes that the key difference between biological and machine intelligence lies in their respective approaches to causal learning and goal-directed behavior. Biological systems exhibit adaptive, goal-directed competencies at multiple scales, from cells to organisms, enabling them to excel at understanding the underlying causes and mechanisms in their environments.

In contrast, current machine intelligence is predominantly adaptive and goal-directed only at a high level, relying on pre-defined abstractions and static interpreters that hinder causal learning. The researchers argue that by formalizing biological intelligence as a Multiscale Competency Architecture (MCA), we can better understand how nested layers of adaptive, goal-directed behavior enable multiscale causal learning and flexible, intelligent behaviors.

The authors suggest that engineering human abstractions directly into machine intelligence may disconnect high-level goal-directed behavior from the low-level adaptive mechanisms that gave rise to it in biological systems. They speculate that this could be one reason why human recall is accompanied by feeling while machine recall is not.

Overall, this paper encourages a deeper exploration of the fundamental differences between biological and machine intelligence, with the goal of developing artificial systems that can more effectively learn about and adapt to the causal structure of the world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Robust agents learn causal world models

Jonathan Richens, Tom Everitt

It has long been hypothesised that causal reasoning plays a fundamental role in robust and general intelligence. However, it is not known if agents must learn causal models in order to generalise to new domains, or if other inductive biases are sufficient. We answer this question, showing that any agent capable of satisfying a regret bound under a large set of distributional shifts must have learned an approximate causal model of the data generating process, which converges to the true causal model for optimal agents. We discuss the implications of this result for several research areas including transfer learning and causal inference.

4/10/2024

cs.AI cs.LG

Learning World Models With Hierarchical Temporal Abstractions: A Probabilistic Perspective

Vaisakh Shaj

Machines that can replicate human intelligence with type 2 reasoning capabilities should be able to reason at multiple levels of spatio-temporal abstractions and scales using internal world models. Devising formalisms to develop such internal world models, which accurately reflect the causal hierarchies inherent in the dynamics of the real world, is a critical research challenge in the domains of artificial intelligence and machine learning. This thesis identifies several limitations with the prevalent use of state space models (SSMs) as internal world models and proposes two new probabilistic formalisms, namely Hidden-Parameter SSMs and Multi-Time Scale SSMs, to address these drawbacks. The structure of graphical models in both formalisms facilitates scalable exact probabilistic inference using belief propagation, as well as end-to-end learning via backpropagation through time. This approach permits the development of scalable, adaptive hierarchical world models capable of representing nonstationary dynamics across multiple temporal abstractions and scales. Moreover, these probabilistic formalisms integrate the concept of uncertainty in world states, thus improving the system's capacity to emulate the stochastic nature of the real world and quantify the confidence in its predictions. The thesis also discusses how these formalisms are in line with related neuroscience literature on the Bayesian brain hypothesis and predictive processing. Our experiments on various real and simulated robots demonstrate that our formalisms can match and in many cases exceed the performance of contemporary transformer variants in making long-range future predictions. We conclude the thesis by reflecting on the limitations of our current models and suggesting directions for future research.

4/29/2024

cs.AI cs.LG

Mechanistic Interpretability for AI Safety -- A Review

Leonard Bereska, Efstratios Gavves

Understanding AI systems' inner workings is critical for ensuring value alignment and safety. This review explores mechanistic interpretability: reverse-engineering the computational mechanisms and representations learned by neural networks into human-understandable algorithms and concepts to provide a granular, causal understanding. We establish foundational concepts such as features encoding knowledge within neural activations and hypotheses about their representation and computation. We survey methodologies for causally dissecting model behaviors and assess the relevance of mechanistic interpretability to AI safety. We investigate challenges surrounding scalability, automation, and comprehensive interpretation. We advocate for clarifying concepts, setting standards, and scaling techniques to handle complex models and behaviors and expand to domains such as vision and reinforcement learning. Mechanistic interpretability could help prevent catastrophic outcomes as AI systems become more powerful and inscrutable.

4/23/2024

cs.AI

Cellular automata, many-valued logic, and deep neural networks

Yani Zhang, Helmut Bölcskei

We develop a theory characterizing the fundamental capability of deep neural networks to learn, from evolution traces, the logical rules governing the behavior of cellular automata (CA). This is accomplished by first establishing a novel connection between CA and Lukasiewicz propositional logic. While binary CA have been known for decades to essentially perform operations in Boolean logic, no such relationship exists for general CA. We demonstrate that many-valued (MV) logic, specifically Lukasiewicz propositional logic, constitutes a suitable language for characterizing general CA as logical machines. This is done by interpolating CA transition functions to continuous piecewise linear functions, which, by virtue of the McNaughton theorem, yield formulae in MV logic characterizing the CA. Recognizing that deep rectified linear unit (ReLU) networks realize continuous piecewise linear functions, it follows that these formulae are naturally extracted from CA evolution traces by deep ReLU networks. A corresponding algorithm together with a software implementation is provided. Finally, we show that the dynamical behavior of CA can be realized by recurrent neural networks.

4/9/2024

cs.AI