AI-Olympics: Exploring the Generalization of Agents through Open Competitions

Read original: arXiv:2405.14358 - Published 5/24/2024 by Chen Wang, Yan Song, Shuai Wu, Sa Wu, Ruizhi Zhang, Shu Lin, Haifeng Zhang

Overview

Between 2021 and 2023, AI-Olympics, a series of online AI competitions, was hosted by the online evaluation platform Jidi in collaboration with the IJCAI committee.
In these competitions, an agent is required to accomplish diverse sports tasks in a two-dimensional continuous world, while competing against an opponent.
This paper provides a brief overview of the competition series and highlights notable findings.
The researchers aim to contribute insights to the field of multi-agent decision-making and explore the generalization of agents through engineering efforts.

Plain English Explanation

The paper discusses a series of online AI competitions called AI-Olympics that took place between 2021 and 2023. These competitions were organized by the online evaluation platform Jidi in collaboration with the IJCAI (International Joint Conferences on Artificial Intelligence) committee.

In these competitions, an AI agent was tasked with accomplishing various sports-related activities in a two-dimensional virtual environment. The agent had to compete against an opponent while performing these tasks. The researchers behind this paper wanted to provide an overview of the competition series and highlight some of the notable findings from these events.

The key goals of the researchers were to gain insights into the field of multi-agent decision-making, where multiple AI agents have to interact and coordinate with each other, and to explore how well these agents can generalize their skills to different tasks through engineering efforts.

Technical Explanation

The paper describes the AI-Olympics competition series, which aimed to assess the capabilities of AI agents in a multi-agent setting. In these competitions, the agents were required to complete a variety of sports tasks in a continuous two-dimensional virtual environment while competing against an opponent agent.

The researchers used the Jidi online evaluation platform to host the competitions and collected data on the agents' performance. By analyzing the results, they sought to gain insights into the challenges and strategies involved in multi-agent decision-making, where agents must coordinate their actions and adapt to their opponents' behavior.

Additionally, the researchers were interested in exploring the generalization of the agents' skills, that is, their ability to apply their learned capabilities to new, unseen tasks. This was done through various engineering efforts, such as designing the competition environments and tasks to encourage the development of transferable skills.
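To make the idea of generalization concrete, here is a minimal toy sketch in Python. It is not the competition's environment or the Jidi platform's API: the race task, the policy signature, and every name below (`play_race`, `goal_seeker`, `always_right`, `evaluate`) are assumptions invented for illustration. Two agents move in a 2D continuous plane toward a goal, and the same agent is scored against the same opponent on a task it suits and on one it has not "seen".

```python
import math
from typing import Callable, Dict, Tuple

Vec = Tuple[float, float]
# Hypothetical policy signature: (own position, goal position) -> desired 2D move.
Policy = Callable[[Vec, Vec], Vec]


def clip_to_unit(v: Vec) -> Vec:
    """Limit a move to length 1 so both agents have the same top speed."""
    norm = math.hypot(v[0], v[1])
    return v if norm <= 1.0 else (v[0] / norm, v[1] / norm)


def play_race(agent: Policy, opponent: Policy, goal: Vec, max_steps: int = 50) -> int:
    """Return 1 if the agent reaches the goal first, -1 if the opponent does, 0 for a tie."""
    pos_a, pos_b = (0.0, 0.0), (0.0, 0.0)
    for _ in range(max_steps):
        move_a = clip_to_unit(agent(pos_a, goal))
        move_b = clip_to_unit(opponent(pos_b, goal))
        pos_a = (pos_a[0] + move_a[0], pos_a[1] + move_a[1])
        pos_b = (pos_b[0] + move_b[0], pos_b[1] + move_b[1])
        done_a = math.dist(pos_a, goal) < 0.5
        done_b = math.dist(pos_b, goal) < 0.5
        if done_a or done_b:
            return int(done_a) - int(done_b)
    return 0


def goal_seeker(pos: Vec, goal: Vec) -> Vec:
    """Adaptive policy: head straight for wherever the goal is."""
    return (goal[0] - pos[0], goal[1] - pos[1])


def always_right(pos: Vec, goal: Vec) -> Vec:
    """Overfitted policy: ignores the goal and always moves along +x."""
    return (1.0, 0.0)


def evaluate(agent: Policy, opponent: Policy, tasks: Dict[str, Vec]) -> Dict[str, int]:
    """Score the agent on every task against the same opponent."""
    return {name: play_race(agent, opponent, goal) for name, goal in tasks.items()}


# One task the fixed policy happens to suit, and one it does not.
tasks = {"seen_goal": (10.0, 0.0), "unseen_goal": (0.0, 10.0)}
scores = evaluate(always_right, goal_seeker, tasks)
```

In this toy setup the fixed policy ties on the goal it happens to suit and loses on the other one, while the goal-seeking policy performs consistently on both. That gap between seen and unseen tasks is the kind of difference a multi-task competition is meant to expose.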

Critical Analysis

The paper provides a high-level overview of the AI-Olympics competition series, but it does not delve into the specific details of the experimental design, the agents' architectures, or the detailed findings from the competitions. While the researchers mention their goals of contributing to multi-agent decision-making and exploring agent generalization, the paper lacks a more in-depth analysis of the insights gained and the implications for the field.

One potential limitation of the research is the use of a two-dimensional continuous environment, which may not fully capture the complexity of real-world multi-agent scenarios. It would be interesting to see how the agents' performance and decision-making strategies would translate to more realistic three-dimensional environments, as discussed in the paper on embodied generalist agents in 3D worlds.

Additionally, the paper does not address the challenges of designing skill-compatible AI, as explored in research on skill-compatible AI methodologies and frameworks for chess. Examining these aspects could provide a more comprehensive understanding of the factors involved in developing versatile and adaptable AI agents.

Conclusion

The AI-Olympics competition series provided a platform for researchers to explore the capabilities of AI agents in a multi-agent setting, where they were required to complete various sports-related tasks while competing against an opponent. The findings from this research contribute to the understanding of multi-agent decision-making and the generalization of agent skills through engineering efforts.

While the paper offers a high-level overview of the competition series, further research is needed to delve deeper into the specific insights gained, the implications for the field, and the potential challenges in developing versatile and adaptable AI agents, as discussed in related research on autonomous evaluation and refinement of digital agents, AI agents for biomedical discovery, and AI agents in the 8th AI City Challenge.

Related Papers

AI Olympics challenge with Evolutionary Soft Actor Critic

Marco Calì, Alberto Sinigaglia, Niccolò Turcato, Ruggero Carli, Gian Antonio Susto

In the following report, we describe the solution we propose for the AI Olympics competition held at IROS 2024. Our solution is based on a Model-free Deep Reinforcement Learning approach combined with an evolutionary strategy. We will briefly describe the algorithms that have been used and then provide details of the approach.

9/4/2024

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, Pengfei Liu

The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoning abilities, we introduce OlympicArena, which includes 11,163 bilingual problems across both text-only and interleaved text-image modalities. These challenges encompass a wide range of disciplines spanning seven fields and 62 international Olympic competitions, rigorously examined for data leakage. We argue that the challenges in Olympic competition problems are ideal for evaluating AI's cognitive reasoning due to their complexity and interdisciplinary nature, which are essential for tackling complex scientific challenges and facilitating discoveries. Beyond evaluating performance across various disciplines using answer-only criteria, we conduct detailed experiments and analyses from multiple perspectives. We delve into the models' cognitive reasoning abilities, their performance across different modalities, and their outcomes in process-level evaluations, which are vital for tasks requiring complex reasoning with lengthy solutions. Our extensive evaluations reveal that even advanced models like GPT-4o only achieve a 39.97% overall accuracy, illustrating current AI limitations in complex reasoning and multimodal integration. Through the OlympicArena, we aim to advance AI towards superintelligence, equipping it to address more complex challenges in science and beyond. We also provide a comprehensive set of resources to support AI research, including a benchmark dataset, an open-source annotation platform, a detailed evaluation tool, and a leaderboard with automatic submission features.

6/19/2024

The Overcooked Generalisation Challenge

Constantin Ruhdorfer, Matteo Bortoletto, Anna Penzkofer, Andreas Bulling

We introduce the Overcooked Generalisation Challenge (OGC) - the first benchmark to study agents' zero-shot cooperation abilities when faced with novel partners and levels in the Overcooked-AI environment. This perspective starkly contrasts a large body of previous work that has trained and evaluated cooperating agents only on the same level, failing to capture generalisation abilities required for real-world human-AI cooperation. Our challenge interfaces with state-of-the-art dual curriculum design (DCD) methods to generate auto-curricula for training general agents in Overcooked. It is the first cooperative multi-agent environment specially designed for DCD methods and, consequently, the first benchmarked with state-of-the-art methods. It is fully GPU-accelerated, built on the DCD benchmark suite minimax, and freely available under an open-source license: https://git.hcics.simtech.uni-stuttgart.de/public-projects/OGC. We show that current DCD algorithms struggle to produce useful policies in this novel challenge, even if combined with recent network architectures that were designed for scalability and generalisability. The OGC pushes the boundaries of real-world human-AI cooperation by enabling the research community to study the impact of generalisation on cooperating agents.

6/27/2024