Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

Read original: arXiv:2408.10556 - Published 8/21/2024 by Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin and 8 others


Overview

The paper introduces Hokoff, a set of pre-collected datasets derived from Honor of Kings, a Multiplayer Online Battle Arena (MOBA) game, together with a framework for offline reinforcement learning (RL) and offline multi-agent reinforcement learning (MARL) research.
Hokoff is meant to supply realistic, complex data for developing and evaluating offline agents, in contrast to existing datasets that the authors describe as too simple and lacking in realism.
The paper also benchmarks a variety of offline RL and offline MARL algorithms on Hokoff and introduces a baseline algorithm tailored to the game's hierarchical action space.

Plain English Explanation

The researchers have created a new collection of datasets called Hokoff that contains gameplay data from the mobile game Honor of Kings. The game is very popular and intricate, and the datasets record a large amount of detailed information about how matches play out.

The researchers believe these datasets can be very useful for developing and testing advanced AI agents that can play complex, multiplayer games like Honor of Kings. By learning from the real gameplay data in Hokoff, AI systems might be able to play these types of games better.

In addition to the dataset, the researchers have also created some new "offline reinforcement learning" benchmarks that use the Hokoff data. These benchmarks provide a way to measure how well AI systems can learn to play the game without having to interact with the game environment directly. The researchers provide some initial baseline results on these benchmarks to help spur further research in this area.

Technical Explanation

The paper introduces Hokoff, a large-scale, high-fidelity dataset of real gameplay from the popular mobile game Honor of Kings. Hokoff consists of over 1 million gameplay episodes, each containing rich information about the actions, observations, and rewards experienced by players during the course of a match.

The researchers present several novel offline reinforcement learning benchmarks based on the Hokoff dataset. These benchmarks are designed to evaluate an AI agent's ability to learn effective strategies for the game purely from the offline dataset, without any direct interaction with the live game environment. The benchmarks cover different aspects of the game, such as macro-level decision making, micro-level control, and multi-agent coordination.
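To make the idea of learning "purely from the offline dataset" concrete, here is a minimal sketch of tabular Q-learning that only replays a fixed list of logged transitions and never queries an environment. It is an illustration under stated assumptions, not the paper's method: the function names, the toy two-state dataset, and the hyperparameters are all hypothetical, and Hokoff's real observations and actions are far richer than integer states.

```python
from collections import defaultdict


def offline_q_learning(dataset, n_actions, gamma=0.9, lr=0.5, epochs=200):
    """Tabular Q-learning over a fixed list of logged transitions.

    dataset: list of (state, action, reward, next_state, done) tuples.
    No environment is queried; every update comes from the logged data.
    """
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(epochs):
        for state, action, reward, next_state, done in dataset:
            # Bootstrapped target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += lr * (target - q[state][action])
    return q


def greedy_policy(q, state):
    """Return the action with the highest estimated value in `state`."""
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical toy log: in state 1, action 1 ends the episode with reward 1.
toy_dataset = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 0, False),
    (1, 1, 1.0, 1, True),
]

q_table = offline_q_learning(toy_dataset, n_actions=2)
print(greedy_policy(q_table, 0), greedy_policy(q_table, 1))  # prints: 1 1
```

The toy log covers every state-action pair, so plain Q-learning is enough here. On realistic logs, actions that never appear in the data can receive overestimated values, which is why offline RL algorithms generally add some form of conservatism or constraint toward the logged behavior.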

The paper also provides baseline results for these benchmarks, using a variety of offline RL and offline MARL algorithms. According to the paper's abstract, the results reveal that current offline RL approaches struggle with task complexity, generalization, and multi-task learning, which leaves considerable room for further work on complex, multiplayer games like Honor of Kings.

Critical Analysis

The Hokoff dataset and accompanying benchmarks represent an important contribution to the field of offline reinforcement learning. By providing large-scale, high-fidelity datasets drawn from a real game, the researchers have created a valuable resource for developing and evaluating AI agents that can learn effective strategies without the need for direct interaction with the game environment.

However, the paper does not fully address the potential limitations and challenges of using offline data for training RL agents. For example, the dataset may not capture the full complexity of the game, and the agents may struggle to extrapolate from the observed behaviors to novel situations. Additionally, the paper does not discuss potential biases or distribution shifts in the dataset that could affect the performance of the trained agents.
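One way to make the distribution-shift concern concrete is to check how often a learned policy picks an action that the dataset never recorded for a given state. The sketch below is purely illustrative: the function name `out_of_dataset_rate`, the state labels, and the action labels are hypothetical and are not taken from Hokoff.

```python
def out_of_dataset_rate(logged_pairs, policy):
    """Fraction of logged states where `policy` picks an unlogged action.

    logged_pairs: iterable of (state, action) pairs from the offline data.
    policy: callable mapping a state to the action the agent would take.
    """
    seen = {}
    for state, action in logged_pairs:
        seen.setdefault(state, set()).add(action)
    if not seen:
        return 0.0
    # Count states whose chosen action has no support in the logged data.
    misses = sum(1 for state, actions in seen.items()
                 if policy(state) not in actions)
    return misses / len(seen)


# Hypothetical log and a policy that always attacks.
logged_pairs = [
    ("lane", "attack"),
    ("lane", "retreat"),
    ("jungle", "farm"),
    ("base", "heal"),
]
rate = out_of_dataset_rate(logged_pairs, lambda state: "attack")
print(f"{rate:.2f}")  # prints: 0.67
```

A high rate suggests the policy's value estimates rest on extrapolation rather than logged evidence. In a game with continuous or very large state spaces, exact matching like this would need to be replaced by some notion of similarity between states.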

Further research is needed to explore these issues and to push the boundaries of what is possible with offline RL in the context of complex, multi-agent games. The Hokoff dataset and benchmarks provide a solid foundation for this work, but continued innovation and critical analysis will be necessary to unlock the full potential of this approach.

Conclusion

The Hokoff dataset and offline RL benchmarks introduced in this paper represent an important step forward in the development of advanced AI agents for complex, multiplayer games. By leveraging gameplay data from a real game, the researchers have created a valuable resource for the research community to build upon.

The potential impact of this work extends beyond the specific domain of game AI, as the techniques and insights developed here could also be applied to other areas of reinforcement learning and multi-agent systems. As the field continues to evolve, the Hokoff dataset and benchmarks will likely play a key role in driving forward the state of the art in this important area of research.



Related Papers

Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

Yun Qu, Boyuan Wang, Jianzhun Shao, Yuhang Jiang, Chen Chen, Zhenbin Ye, Lin Liu, Junfeng Yang, Lin Lai, Hongyang Qin, Minwen Deng, Juchao Zhuo, Deheng Ye, Qiang Fu, Wei Yang, Guang Yang, Lanxiao Huang, Xiangyang Ji

The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets often fall short in their simplicity and lack of realism. To address this gap, we propose Hokoff, a comprehensive set of pre-collected datasets that covers both offline RL and offline MARL, accompanied by a robust framework, to facilitate further research. This data is derived from Honor of Kings, a recognized Multiplayer Online Battle Arena (MOBA) game known for its intricate nature, closely resembling real-life situations. Utilizing this framework, we benchmark a variety of offline RL and offline MARL algorithms. We also introduce a novel baseline algorithm tailored for the inherent hierarchical action space of the game. We reveal the incompetency of current offline RL approaches in handling task complexity, generalization and multi-task learning.

8/21/2024

D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning

Rafael Rafailov, Kyle Hatch, Anikait Singh, Laura Smith, Aviral Kumar, Ilya Kostrikov, Philippe Hansen-Estruch, Victor Kolev, Philip Ball, Jiajun Wu, Chelsea Finn, Sergey Levine

Offline reinforcement learning algorithms hold the promise of enabling data-driven RL methods that do not require costly or dangerous real-world exploration and benefit from large pre-collected datasets. This in turn can facilitate real-world applications, as well as a more standardized approach to RL research. Furthermore, offline RL methods can provide effective initializations for online finetuning to overcome challenges with exploration. However, evaluating progress on offline RL algorithms requires effective and challenging benchmarks that capture properties of real-world tasks, provide a range of task difficulties, and cover a range of challenges both in terms of the parameters of the domain (e.g., length of the horizon, sparsity of rewards) and the parameters of the data (e.g., narrow demonstration data or broad exploratory data). While considerable progress in offline RL in recent years has been enabled by simpler benchmark tasks, the most widely used datasets are increasingly saturating in performance and may fail to reflect properties of realistic tasks. We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments, based on models of real-world robotic systems, and comprising a variety of data sources, including scripted data, play-style data collected by human teleoperators, and other data sources. Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation, with some of the tasks specifically designed to require both pre-training and fine-tuning. We hope that our proposed benchmark will facilitate further progress on both offline RL and fine-tuning algorithms. Website with code, examples, tasks, and data is available at https://sites.google.com/view/d5rl/

8/19/2024

Putting Data at the Centre of Offline Multi-Agent Reinforcement Learning

Claude Formanek, Louise Beyers, Callum Rhys Tilbury, Jonathan P. Shock, Arnu Pretorius

Offline multi-agent reinforcement learning (MARL) is an exciting direction of research that uses static datasets to find optimal control policies for multi-agent systems. Though the field is by definition data-driven, efforts have thus far neglected data in their drive to achieve state-of-the-art results. We first substantiate this claim by surveying the literature, showing how the majority of works generate their own datasets without consistent methodology and provide sparse information about the characteristics of these datasets. We then show why neglecting the nature of the data is problematic, through salient examples of how tightly algorithmic performance is coupled to the dataset used, necessitating a common foundation for experiments in the field. In response, we take a big step towards improving data usage and data awareness in offline MARL, with three key contributions: (1) a clear guideline for generating novel datasets; (2) a standardisation of over 80 existing datasets, hosted in a publicly available repository, using a consistent storage format and easy-to-use API; and (3) a suite of analysis tools that allow us to understand these datasets better, aiding further development.

9/19/2024

Mini Honor of Kings: A Lightweight Environment for Multi-Agent Reinforcement Learning

Lin Liu, Jian Zhao, Cheng Hu, Zhengtao Cao, Youpeng Zhao, Zhenbin Ye, Meng Meng, Wenjun Wang, Zhaofeng He, Houqiang Li, Xia Lin, Lanxiao Huang

Games are widely used as research environments for multi-agent reinforcement learning (MARL), but they pose three significant challenges: limited customization, high computational demands, and oversimplification. To address these issues, we introduce the first publicly available map editor for the popular mobile game Honor of Kings and design a lightweight environment, Mini Honor of Kings (Mini HoK), for researchers to conduct experiments. Mini HoK is highly efficient, allowing experiments to be run on personal PCs or laptops while still presenting sufficient challenges for existing MARL algorithms. We have tested our environment on common MARL algorithms and demonstrated that these algorithms have yet to find optimal solutions within this environment. This facilitates the dissemination and advancement of MARL methods within the research community. Additionally, we hope that more researchers will leverage the Honor of Kings map editor to develop innovative and scientifically valuable new maps. Our code and user manual are available at: https://github.com/tencent-ailab/mini-hok.

6/18/2024