Playing Games with your PET: Extending the Partial Exploration Tool to Stochastic Games

Read original: arXiv:2405.03885 - Published 5/14/2024 by Tobias Meggendorfer, Maximilian Weininger

➖

Overview

Presents version 2.0 of the Partial Exploration Tool (PET), a tool for verification of probabilistic systems
Extends the previous version by adding support for stochastic games, based on a recent unified framework for sound value iteration algorithms
PET2 is the first tool implementing a sound and efficient approach for solving stochastic games with objectives of the type reachability/safety and mean payoff
Develops and implements a partial-exploration based variant for all three objectives
Experimental evaluation shows PET2 offers the most efficient partial-exploration based algorithm and is the most viable tool on stochastic games, even outperforming unsound tools

Plain English Explanation

The researchers have developed an updated version of their Partial Exploration Tool (PET), which is used to verify the behavior of probabilistic systems. The new version, PET2, has been enhanced to support a type of system called a "stochastic game." Stochastic games are a mathematical model that can be used to represent situations where multiple decision-makers interact with each other in an environment with uncertain outcomes.

The key innovation in PET2 is that it is the first tool to implement a sound and efficient approach for solving stochastic games with certain types of objectives, such as reaching a particular state (reachability/safety) or maximizing the average payoff over time (mean payoff). This is important because stochastic games can be difficult to analyze, and previous tools either had limitations or produced unreliable results.

In addition to this core capability, the researchers have also developed a "partial exploration" variant of their algorithm. This means the tool can find solutions without having to exhaustively explore all possible scenarios, which can make it more efficient in practice.

The researchers have evaluated PET2 experimentally and found that it outperforms even unsound (potentially inaccurate) tools when it comes to analyzing stochastic games. This suggests PET2 could be a valuable tool for researchers and practitioners working with probabilistic systems and strategic decision-making problems.

Technical Explanation

The paper presents version 2.0 of the Partial Exploration Tool (PET), a tool for the verification of probabilistic systems. The key extension in PET2 is the addition of support for stochastic games, based on a recent unified framework for sound value iteration algorithms.

Stochastic games are a generalization of Markov decision processes that involve multiple decision-makers, each trying to optimize their own objective function. PET2 is the first tool to implement a sound and efficient approach for solving stochastic games with objectives of the type reachability/safety and mean payoff. The researchers develop and implement a partial-exploration based variant of their algorithm for all three objectives (reachability/safety, mean payoff, and the previously supported parity objectives).

The experimental evaluation shows that PET2 offers the most efficient partial-exploration based algorithm and is the most viable tool on stochastic games, even outperforming unsound tools. This is achieved through techniques like learning optimal deterministic policies and online minimax strategies for partially observable environments.

Critical Analysis

The paper presents a significant advance in the verification of probabilistic systems, particularly in the realm of stochastic games. The addition of support for stochastic games in PET2 is an important step forward, as these types of systems are difficult to analyze and previous tools had notable limitations.

One potential caveat is that the paper does not provide a detailed comparison of PET2's performance to other state-of-the-art tools for stochastic game analysis. While the researchers claim PET2 outperforms unsound tools, a more comprehensive benchmarking against other sound approaches would help strengthen the case for PET2's superiority.

Additionally, the paper does not delve into the potential limitations or failure modes of the partial exploration approach. It would be helpful to understand the scenarios where this technique may struggle or produce less reliable results, as well as any trade-offs involved in terms of accuracy or computation time.

Further research could also explore the applicability of PET2 to multi-agent training beyond zero-sum games or the integration of Bayesian optimization techniques to enhance the efficiency of the partial exploration algorithm.

Conclusion

The paper presents a significant advancement in the field of probabilistic system verification with the introduction of PET2, a tool that can effectively analyze stochastic games. PET2's sound and efficient approach to solving stochastic games with objectives like reachability/safety and mean payoff makes it a valuable tool for researchers and practitioners working in this domain.

The partial exploration variant of the algorithm also shows promise in terms of improving the practical applicability of the tool, though further research is needed to fully understand its limitations and potential enhancements. Overall, PET2 represents an important step forward in the verification of complex probabilistic systems and strategic decision-making problems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

➖

Playing Games with your PET: Extending the Partial Exploration Tool to Stochastic Games

Tobias Meggendorfer, Maximilian Weininger

We present version 2.0 of the Partial Exploration Tool (PET), a tool for verification of probabilistic systems. We extend the previous version by adding support for stochastic games, based on a recent unified framework for sound value iteration algorithms. Thereby, PET2 is the first tool implementing a sound and efficient approach for solving stochastic games with objectives of the type reachability/safety and mean payoff. We complement this approach by developing and implementing a partial-exploration based variant for all three objectives. Our experimental evaluation shows that PET2 offers the most efficient partial-exploration based algorithm and is the most viable tool on SGs, even outperforming unsound tools.

5/14/2024

🧠

Partially Observable Stochastic Games with Neural Perception Mechanisms

Rui Yan, Gabriel Santos, Gethin Norman, David Parker, Marta Kwiatkowska

Stochastic games are a well established model for multi-agent sequential decision making under uncertainty. In practical applications, though, agents often have only partial observability of their environment. Furthermore, agents increasingly perceive their environment using data-driven approaches such as neural networks trained on continuous data. We propose the model of neuro-symbolic partially-observable stochastic games (NS-POSGs), a variant of continuous-space concurrent stochastic games that explicitly incorporates neural perception mechanisms. We focus on a one-sided setting with a partially-informed agent using discrete, data-driven observations and another, fully-informed agent. We present a new method, called one-sided NS-HSVI, for approximate solution of one-sided NS-POSGs, which exploits the piecewise constant structure of the model. Using neural network pre-image analysis to construct finite polyhedral representations and particle-based representations for beliefs, we implement our approach and illustrate its practical applicability to the analysis of pedestrian-vehicle and pursuit-evasion scenarios.

7/2/2024

Stochastic Optimisation Framework using the Core Imaging Library and Synergistic Image Reconstruction Framework for PET Reconstruction

Evangelos Papoutsellis, Casper da Costa-Luis, Daniel Deidda, Claire Delplancke, Margaret Duff, Gemma Fardell, Ashley Gillman, Jakob S. J{o}rgensen, Zeljko Kereta, Evgueni Ovtchinnikov, Edoardo Pasca, Georg Schramm, Kris Thielemans

We introduce a stochastic framework into the open--source Core Imaging Library (CIL) which enables easy development of stochastic algorithms. Five such algorithms from the literature are developed, Stochastic Gradient Descent, Stochastic Average Gradient (-Am'elior'e), (Loopless) Stochastic Variance Reduced Gradient. We showcase the functionality of the framework with a comparative study against a deterministic algorithm on a simulated 2D PET dataset, with the use of the open-source Synergistic Image Reconstruction Framework. We observe that stochastic optimisation methods can converge in fewer passes of the data than a standard deterministic algorithm.

6/24/2024

Distributed Stackelberg Strategies in State-based Potential Games for Autonomous Decentralized Learning Manufacturing Systems

Steve Yuwono, Dorothea Schwung, Andreas Schwung

This article describes a novel game structure for autonomously optimizing decentralized manufacturing systems with multi-objective optimization challenges, namely Distributed Stackelberg Strategies in State-Based Potential Games (DS2-SbPG). DS2-SbPG integrates potential games and Stackelberg games, which improves the cooperative trade-off capabilities of potential games and the multi-objective optimization handling by Stackelberg games. Notably, all training procedures remain conducted in a fully distributed manner. DS2-SbPG offers a promising solution to finding optimal trade-offs between objectives by eliminating the complexities of setting up combined objective optimization functions for individual players in self-learning domains, particularly in real-world industrial settings with diverse and numerous objectives between the sub-systems. We further prove that DS2-SbPG constitutes a dynamic potential game that results in corresponding converge guarantees. Experimental validation conducted on a laboratory-scale testbed highlights the efficacy of DS2-SbPG and its two variants, such as DS2-SbPG for single-leader-follower and Stack DS2-SbPG for multi-leader-follower. The results show significant reductions in power consumption and improvements in overall performance, which signals the potential of DS2-SbPG in real-world applications.

8/14/2024