A Guide to Tracking Phylogenies in Parallel and Distributed Agent-based Evolution Models

Read original: arXiv:2405.10183 - Published 5/17/2024 by Matthew Andres Moreno, Anika Ranjan, Emily Dolson, Luis Zaman

Overview

Explores techniques for tracking phylogenies (evolutionary relationships) in parallel and distributed agent-based evolution models
Introduces libraries and tools for in silico phylogenetic tracking
Discusses how phylogeny-informed interaction estimation can accelerate co-evolutionary processes
Describes a trackable island model genetic algorithm for parallel evolutionary simulations

Plain English Explanation

This paper presents methods for following the evolutionary history, or phylogeny, of populations in complex, distributed computer simulations of evolution. When running simulations of evolving systems with many interacting components, it can be challenging to keep track of how the different "species" or agents are related and how they are changing over time.

The researchers describe Phylotrack, a set of C++ and Python libraries that make it easier to monitor phylogenetic relationships in these kinds of simulations. They also show how knowing the phylogeny can speed up the co-evolution of different agents by informing estimates of how they interact. Additionally, they present a trackable island model genetic algorithm that allows parallel evolutionary simulations to be monitored at scale.

Overall, these techniques aim to provide researchers with better tools for studying the complex dynamics of evolving systems, whether they are modeling biological evolution, the co-evolution of machine learning algorithms, or other phenomena. Being able to track the phylogenetic relationships can lead to new insights and help accelerate the discovery process.

Technical Explanation

The paper introduces several methods for tracking phylogenies in parallel and distributed agent-based evolution models. First, the authors describe the Phylotrack C++ and Python libraries, which provide in silico phylogenetic tracking so that researchers can monitor the evolutionary relationships between agents in their simulations.
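To make the idea of tracking ancestry concrete, here is a minimal sketch of direct lineage tracking in plain Python. It is not the Phylotrack API: the `Taxon` and `LineageTracker` names and their methods are hypothetical, chosen only to show how recording one parent pointer per agent is enough to recover lineages and most recent common ancestors.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Taxon:
    """One node in the ancestry tree (hypothetical structure)."""
    uid: int
    parent: Optional[int]  # None marks a root
    generation: int


class LineageTracker:
    """Records every parent-child relationship exactly (direct tracking)."""

    def __init__(self) -> None:
        self.taxa: Dict[int, Taxon] = {}
        self._next_uid = 0

    def add_root(self) -> int:
        """Register a founding agent with no recorded parent."""
        return self._add(parent=None, generation=0)

    def add_offspring(self, parent_uid: int) -> int:
        """Register a birth and return the new agent's uid."""
        parent = self.taxa[parent_uid]
        return self._add(parent=parent_uid, generation=parent.generation + 1)

    def _add(self, parent: Optional[int], generation: int) -> int:
        uid = self._next_uid
        self._next_uid += 1
        self.taxa[uid] = Taxon(uid, parent, generation)
        return uid

    def lineage(self, uid: int) -> List[int]:
        """Return uids from the given agent back to its root."""
        path: List[int] = []
        cursor: Optional[int] = uid
        while cursor is not None:
            path.append(cursor)
            cursor = self.taxa[cursor].parent
        return path

    def mrca(self, a: int, b: int) -> Optional[int]:
        """Most recent common ancestor, or None if the roots differ."""
        ancestors_of_a = set(self.lineage(a))
        for uid in self.lineage(b):
            if uid in ancestors_of_a:
                return uid
        return None
```

A simulation would call `add_offspring` at each birth event. Because this sketch keeps every record in one central dictionary, it also illustrates why exact tracking can become awkward once agents are spread across many processors.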

The paper also discusses how phylogeny-informed interaction estimation can be used to accelerate co-evolutionary processes. By incorporating information about the phylogenetic history of the agents, the researchers demonstrate that the co-evolution of different components can be sped up.
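One way to picture phylogeny-informed estimation is as a lookup over relatives: if two individuals have never been evaluated against each other, reuse the outcome recorded for their closest evaluated ancestors. The sketch below is an illustrative simplification under that assumption, not the published method. The function names, the `known` outcome table, and the `max_distance` cutoff are all hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple


def ancestors(uid: int, parent_of: Dict[int, Optional[int]]) -> List[int]:
    """Return uid followed by its ancestors, nearest first."""
    chain: List[int] = []
    cursor: Optional[int] = uid
    while cursor is not None:
        chain.append(cursor)
        cursor = parent_of[cursor]
    return chain


def estimate_outcome(
    x: int,
    y: int,
    parent_of_x: Dict[int, Optional[int]],
    parent_of_y: Dict[int, Optional[int]],
    known: Dict[Tuple[int, int], float],
    max_distance: int = 4,
) -> Optional[float]:
    """Estimate the x-versus-y outcome from the nearest evaluated relatives.

    Distance is the number of generations stepped back on both sides.
    Returns None when no evaluated pair lies within max_distance, in which
    case the caller would fall back to a real evaluation.
    """
    chain_x = ancestors(x, parent_of_x)
    chain_y = ancestors(y, parent_of_y)
    candidates = [
        (dx + dy, known[(ax, ay)])
        for (dx, ax), (dy, ay) in product(enumerate(chain_x), enumerate(chain_y))
        if dx + dy <= max_distance and (ax, ay) in known
    ]
    if not candidates:
        return None
    nearest = min(distance for distance, _ in candidates)
    closest = [score for distance, score in candidates if distance == nearest]
    return sum(closest) / len(closest)  # average ties at the same distance
```

The saving comes from skipping real evaluations whenever the estimate is available. How far back to search, and how to weight more distant relatives, are design choices this sketch leaves open.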

Additionally, the authors present a trackable island model genetic algorithm that enables parallel evolutionary simulations to be tracked at scale. This approach allows for the monitoring of phylogenetic relationships in large-scale, distributed evolution models, which can be useful for studying the dynamics of complex adaptive systems.
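The following sketch shows the general shape of an island model genetic algorithm with ancestry records attached. It is a toy under stated assumptions rather than the authors' implementation: islands are stepped serially instead of on separate processors, selection is a binary tournament on a single numeric trait, migration swaps agents between ring neighbors, and every name (`run_islands`, `parent_of`, the parameters) is hypothetical.

```python
import random
from typing import Dict, List, Optional, Tuple

Agent = Tuple[int, float]  # (unique id, trait value used as fitness)


def run_islands(
    n_islands: int = 4,
    island_size: int = 8,  # must be at least 2 for tournament selection
    generations: int = 20,
    migration_rate: float = 0.1,
    seed: int = 1,
) -> Tuple[List[List[Agent]], Dict[int, Optional[int]]]:
    """Evolve several islands and return them with a parent-pointer record."""
    rng = random.Random(seed)
    parent_of: Dict[int, Optional[int]] = {}
    next_uid = 0

    def new_agent(trait: float, parent: Optional[int]) -> Agent:
        nonlocal next_uid
        uid = next_uid
        next_uid += 1
        parent_of[uid] = parent  # the tracking step: one record per birth
        return (uid, trait)

    islands = [
        [new_agent(rng.random(), None) for _ in range(island_size)]
        for _ in range(n_islands)
    ]

    for _ in range(generations):
        # Each island reproduces independently; on real hardware this loop
        # body is what would run in parallel.
        for i, pop in enumerate(islands):
            offspring = []
            for _ in range(island_size):
                a, b = rng.sample(pop, 2)
                winner = a if a[1] >= b[1] else b
                mutated = winner[1] + rng.gauss(0.0, 0.05)
                offspring.append(new_agent(mutated, winner[0]))
            islands[i] = offspring
        # Occasional migration: swap one agent with the next island in a ring.
        for i in range(n_islands):
            if rng.random() < migration_rate:
                j = (i + 1) % n_islands
                a_idx = rng.randrange(island_size)
                b_idx = rng.randrange(island_size)
                islands[i][a_idx], islands[j][b_idx] = (
                    islands[j][b_idx],
                    islands[i][a_idx],
                )
    return islands, parent_of
```

Following `parent_of` back from any surviving agent recovers its full lineage, including which founding island it descends from, even after the lineage has migrated.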

Finally, the paper touches on how these techniques could be applied to inferring phylogenies of large language models, highlighting the potential for these methods to be used in a variety of domains beyond just biological evolution.

Critical Analysis

The paper presents a comprehensive set of methods for tracking phylogenies in parallel and distributed agent-based evolution models, which is an important capability for researchers studying complex adaptive systems. The authors demonstrate the utility of these techniques through several real-world examples and use cases.

One potential limitation of the work is the computational complexity involved in monitoring phylogenetic relationships, especially in large-scale simulations. The researchers acknowledge this challenge and discuss strategies for mitigating it, such as the use of parallel processing and efficient data structures. However, the scalability of these methods may still be an area for further research and optimization.

Additionally, the paper does not delve deeply into the potential biases or limitations of the phylogenetic inference algorithms used in these methods. As with any phylogenetic analysis, there may be underlying assumptions or uncertainties that could impact the interpretation of the results. Further investigation into the robustness and reliability of the phylogenetic tracking techniques would be valuable.

Overall, this paper makes a significant contribution to the field of agent-based modeling and evolutionary computation by providing a set of practical tools and techniques for researchers to better understand the dynamics of their evolving systems. The ability to track phylogenies can lead to new insights and accelerate the discovery of complex patterns and relationships.

Conclusion

This paper introduces a suite of methods and tools for tracking phylogenies in parallel and distributed agent-based evolution models. By covering in silico phylogenetic tracking, phylogeny-informed interaction estimation, and parallel evolutionary simulations, the authors give researchers a comprehensive set of techniques for understanding the complex dynamics of evolving systems.

These advances have implications for a wide range of domains, from biological evolution to the co-evolution of machine learning algorithms. The ability to monitor the phylogenetic relationships between agents can lead to new insights and accelerate the discovery process, ultimately contributing to our understanding of complex adaptive systems.

While the computational complexity of these methods may present some challenges, the researchers have addressed this issue and provided strategies for mitigating it. As the field of agent-based modeling and evolutionary computation continues to evolve, these techniques will likely become increasingly valuable tools for researchers seeking to unravel the mysteries of emergent phenomena in silico.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

A Guide to Tracking Phylogenies in Parallel and Distributed Agent-based Evolution Models

Matthew Andres Moreno, Anika Ranjan, Emily Dolson, Luis Zaman

Computer simulations are an important tool for studying the mechanics of biological evolution. In particular, in silico work with agent-based models provides an opportunity to collect high-quality records of ancestry relationships among simulated agents. Such phylogenies can provide insight into evolutionary dynamics within these simulations. Existing work generally tracks lineages directly, yielding an exact phylogenetic record of evolutionary history. However, direct tracking can be inefficient for large-scale, many-processor evolutionary simulations. An alternate approach to extracting phylogenetic information from simulation that scales more favorably is post hoc estimation, akin to how bioinformaticians build phylogenies by assessing genetic similarities between organisms. Recently introduced "hereditary stratigraphy" algorithms provide means for efficient inference of phylogenetic history from non-coding annotations on simulated organisms' genomes. A number of options exist in configuring hereditary stratigraphy methodology, but no work has yet tested how they impact reconstruction quality. To address this question, we surveyed reconstruction accuracy under alternate configurations across a matrix of evolutionary conditions varying in selection pressure, spatial structure, and ecological dynamics. We synthesize results from these experiments to suggest a prescriptive system of best practices for work with hereditary stratigraphy, ultimately guiding researchers in choosing appropriate instrumentation for large-scale simulation studies.
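The annotation-based approach described in this abstract can be illustrated with a small sketch. It assumes the simplest possible configuration: each genome carries one random fingerprint per generation, offspring copy the parent's list and append a fresh value, and relatedness is estimated afterwards by finding where two lists stop matching. The function names and the `bits` parameter are hypothetical, and this toy keeps every fingerprint, whereas a practical configuration would likely discard some to bound memory use.

```python
import random
from typing import List


def found_annotation(rng: random.Random, bits: int = 64) -> List[int]:
    """Create the annotation for a founding agent (generation 0)."""
    return [rng.getrandbits(bits)]


def inherit(parent: List[int], rng: random.Random, bits: int = 64) -> List[int]:
    """Copy the parent's annotation and append one new random fingerprint."""
    return parent + [rng.getrandbits(bits)]


def estimate_mrca_generation(a: List[int], b: List[int]) -> int:
    """Return the last generation at which both annotations agree.

    A result of -1 means no shared ancestry was detected. Because distinct
    lineages can draw the same fingerprint by chance, the estimate can be
    too recent; with b-bit fingerprints each spurious match has probability
    2**-b, so wider fingerprints trade memory for accuracy.
    """
    shared = -1
    for generation, (fa, fb) in enumerate(zip(a, b)):
        if fa != fb:
            break
        shared = generation
    return shared
```

No central record is needed here: any two agents can be compared using only the annotations they carry, which is what makes this style of tracking attractive for many-processor simulations.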

5/17/2024

Trackable Agent-based Evolution Models at Wafer Scale

Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman

Continuing improvements in computing hardware are poised to transform capabilities for in silico modeling of cross-scale phenomena underlying major open questions in evolutionary biology and artificial life, such as transitions in individuality, eco-evolutionary dynamics, and rare evolutionary events. Emerging ML/AI-oriented hardware accelerators, like the 850,000 processor Cerebras Wafer Scale Engine (WSE), hold particular promise. However, practical challenges remain in conducting informative evolution experiments that efficiently utilize these platforms' large processor counts. Here, we focus on the problem of extracting phylogenetic information from agent-based evolution on the WSE platform. This goal drove significant refinements to decentralized in silico phylogenetic tracking, reported here. These improvements yield order-of-magnitude performance improvements. We also present an asynchronous island-based genetic algorithm (GA) framework for WSE hardware. Emulated and on-hardware GA benchmarks with a simple tracking-enabled agent model clock upwards of 1 million generations a minute for population sizes reaching 16 million agents. We validate phylogenetic reconstructions from these trials and demonstrate their suitability for inference of underlying evolutionary conditions. In particular, we demonstrate extraction, from wafer-scale simulation, of clear phylometric signals that differentiate runs with adaptive dynamics enabled versus disabled. Together, these benchmark and validation trials reflect strong potential for highly scalable agent-based evolution simulation that is both efficient and observable. Developed capabilities will bring entirely new classes of previously intractable research questions within reach, benefiting further explorations within the evolutionary biology and artificial life communities across a variety of emerging high-performance computing platforms.

6/4/2024

Phylotrack: C++ and Python libraries for in silico phylogenetic tracking

Emily Dolson, Santiago Rodriguez-Papa, Matthew Andres Moreno

In silico evolution instantiates the processes of heredity, variation, and differential reproductive success (the three ingredients for evolution by natural selection) within digital populations of computational agents. Consequently, these populations undergo evolution, and can be used as virtual model systems for studying evolutionary dynamics. This experimental paradigm -- used across biological modeling, artificial life, and evolutionary computation -- complements research done using in vitro and in vivo systems by enabling experiments that would be impossible in the lab or field. One key benefit is complete, exact observability. For example, it is possible to perfectly record all parent-child relationships across simulation history, yielding complete phylogenies (ancestry trees). This information reveals when traits were gained or lost, and also facilitates inference of underlying evolutionary dynamics. The Phylotrack project provides libraries for tracking and analyzing phylogenies in in silico evolution. The project is composed of 1) Phylotracklib: a header-only C++ library, developed under the umbrella of the Empirical project, and 2) Phylotrackpy: a Python wrapper around Phylotracklib, created with Pybind11. Both components supply a public-facing API to attach phylogenetic tracking to digital evolution systems, as well as a stand-alone interface for measuring a variety of popular phylogenetic topology metrics. Underlying design and C++ implementation prioritize efficiency, allowing for fast generational turnover for agent populations numbering in the tens of thousands. Several explicit features (e.g., phylogeny pruning and abstraction) are provided for reducing the memory footprint of phylogenetic information.
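The pruning mentioned at the end of this abstract can be sketched generically: lineages with no living descendants contribute nothing to the phylogeny of the current population, so their records can be dropped. The function below is a hypothetical illustration over a plain parent-pointer dictionary and does not use or reproduce the Phylotrack API.

```python
from typing import Dict, Iterable, Optional


def prune_extinct(
    parent_of: Dict[int, Optional[int]],
    extant: Iterable[int],
) -> Dict[int, Optional[int]]:
    """Keep only records that lie on a path from a living agent to a root."""
    kept: Dict[int, Optional[int]] = {}
    for uid in extant:
        cursor: Optional[int] = uid
        # Stop early once we reach an ancestor that is already retained.
        while cursor is not None and cursor not in kept:
            kept[cursor] = parent_of[cursor]
            cursor = parent_of[cursor]
    return kept
```

Each record is visited at most once, so the cost is linear in the size of the retained tree. This sketch rebuilds the pruned tree in one batch, whereas a long-running simulation would more plausibly drop dead-end records incrementally as agents die.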

7/18/2024

Phylogeny-Informed Interaction Estimation Accelerates Co-Evolutionary Learning

Jack Garbus, Thomas Willkens, Alexander Lalejini, Jordan Pollack

Co-evolution is a powerful problem-solving approach. However, fitness evaluation in co-evolutionary algorithms can be computationally expensive, as the quality of an individual in one population is defined by its interactions with many (or all) members of one or more other populations. To accelerate co-evolutionary systems, we introduce phylogeny-informed interaction estimation, which uses runtime phylogenetic analysis to estimate interaction outcomes between individuals based on how their relatives performed against each other. We test our interaction estimation method with three distinct co-evolutionary systems: two systems focused on measuring problem-solving success and one focused on measuring evolutionary open-endedness. We find that phylogeny-informed estimation can substantially reduce the computation required to solve problems, particularly at the beginning of long-term evolutionary runs. Additionally, we find that our estimation method initially jump-starts the evolution of neural complexity in our open-ended domain, but estimation-free systems eventually catch up if given enough time. More broadly, continued refinements to these phylogeny-informed interaction estimation methods offer a promising path to reducing the computational cost of running co-evolutionary systems while maintaining their open-endedness.

4/11/2024