Learning Online Belief Prediction for Efficient POMDP Planning in Autonomous Driving

2401.15315

Published 6/19/2024 by Zhiyu Huang, Chen Tang, Chen Lv, Masayoshi Tomizuka, Wei Zhan

Abstract

Effective decision-making in autonomous driving relies on accurate inference of other traffic agents' future behaviors. To achieve this, we propose an online belief-update-based behavior prediction model and an efficient planner for Partially Observable Markov Decision Processes (POMDPs). We develop a Transformer-based prediction model, enhanced with a recurrent neural memory model, to dynamically update latent belief state and infer the intentions of other agents. The model can also integrate the ego vehicle's intentions to reflect closed-loop interactions among agents, and it learns from both offline data and online interactions. For planning, we employ a Monte-Carlo Tree Search (MCTS) planner with macro actions, which reduces computational complexity by searching over temporally extended action steps. Inside the MCTS planner, we use predicted long-term multi-modal trajectories to approximate future updates, which eliminates iterative belief updating and improves the running efficiency. Our approach also incorporates deep Q-learning (DQN) as a search prior, which significantly improves the performance of the MCTS planner. Experimental results from simulated environments validate the effectiveness of our proposed method. The online belief update model can significantly enhance the accuracy and temporal consistency of predictions, leading to improved decision-making performance. Employing DQN as a search prior in the MCTS planner considerably boosts its performance and outperforms an imitation learning-based prior. Additionally, we show that the MCTS planning with macro actions substantially outperforms the vanilla method in terms of performance and efficiency.

Overview

This paper presents an approach for efficient Partially Observable Markov Decision Process (POMDP) planning in autonomous driving, where the ego vehicle must act under uncertainty about the intentions and future behavior of other traffic agents.
The key idea is a learned prediction model that updates a latent belief state online as new observations arrive; its long-term multi-modal trajectory predictions are then used inside a Monte-Carlo Tree Search (MCTS) planner.
Experiments in simulated driving environments show that online belief updates improve prediction accuracy and temporal consistency, and that macro actions and a deep Q-learning (DQN) search prior improve planning performance and efficiency.

Plain English Explanation

In autonomous driving, the car's decision-making process needs to account for the uncertain intentions of other vehicles and pedestrians on the road, which cannot be observed directly. This is a challenging problem that can be modeled as a Partially Observable Markov Decision Process (POMDP), a framework in which the agent keeps a belief (a probability distribution over the hidden quantities) and plans with respect to that belief.

The authors of this paper propose a solution to this problem. Instead of relying on a fixed, one-shot prediction, they introduce a learning-based model that maintains a belief about what other agents intend to do and updates that belief as new observations arrive. The resulting predictions are then fed into the POMDP planner, allowing the autonomous vehicle to make more informed and efficient decisions.

The key insight is that a belief that is refined online gives more accurate and more temporally consistent predictions of other agents' future actions, so the planner can anticipate them and plan accordingly. According to the authors' experiments, this improves both planning efficiency and decision quality.

Technical Explanation

The proposed approach consists of two main components:

Online Belief Prediction Model: The authors train a Transformer-based prediction model, augmented with a recurrent neural memory, that updates a latent belief state as new observations arrive and infers the intentions of other agents. The model can also condition on the ego vehicle's intentions to capture closed-loop interaction, and it learns from both offline data and online interactions.
POMDP Planning with Predicted Beliefs: The planner is a Monte-Carlo Tree Search (MCTS) over macro actions, that is, temporally extended action steps, which shortens the search. Inside the tree, the predicted long-term multi-modal trajectories approximate future belief updates, so the belief does not have to be updated iteratively at every node. A deep Q-learning (DQN) network serves as a search prior.
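To make the notion of an online belief update concrete, here is a minimal sketch of the classical discrete Bayes update over a small set of intentions. It is not the authors' model: the paper uses a learned latent belief state, whereas this sketch swaps in an explicit probability vector. The intention labels, the likelihood values, and the function name `update_belief` are all assumptions chosen for illustration.

```python
import numpy as np

def update_belief(belief, likelihoods):
    """One Bayes step: posterior is proportional to prior * likelihood."""
    belief = np.asarray(belief, dtype=float)
    likelihoods = np.asarray(likelihoods, dtype=float)
    unnormalized = belief * likelihoods
    total = unnormalized.sum()
    if total <= 0.0:
        # The observation is impossible under every hypothesis; keep the prior.
        return belief
    return unnormalized / total

# Hypothetical intentions of another vehicle approaching a merge.
INTENTIONS = ("yield", "proceed")

# Start from a uniform belief over the two intentions.
belief = np.array([0.5, 0.5])

# Hypothetical values of P(observation | intention) for two consecutive
# observations in which the other vehicle is seen slowing down.
observation_likelihoods = [
    np.array([0.8, 0.3]),
    np.array([0.7, 0.2]),
]

for likelihoods in observation_likelihoods:
    belief = update_belief(belief, likelihoods)

for name, p in zip(INTENTIONS, belief):
    print(f"P({name}) = {p:.3f}")
```

Each observation shifts probability mass toward the intention that explains it better, which is the behavior a learned belief model is meant to reproduce without hand-written likelihoods.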

The authors evaluate the approach in simulated driving environments. According to the abstract, the online belief update improves the accuracy and temporal consistency of predictions, the DQN search prior outperforms an imitation-learning-based prior, and MCTS with macro actions outperforms vanilla MCTS in both performance and efficiency.
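The abstract states that a DQN is used as a search prior for the MCTS planner but this summary does not give the exact formula. One common way a learned prior enters tree search is a PUCT-style selection rule, sketched below under that assumption. The macro-action names, the Q-values, the softmax conversion, the constant `c_puct`, and the `+ 1` under the square root (added so the prior already matters at an unvisited node) are illustrative choices, not details taken from the paper.

```python
import math

def softmax(values, temperature=1.0):
    """Turn Q-values into a probability distribution usable as a prior."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def select_action(q_mean, visits, prior, c_puct=1.5):
    """Pick the child maximizing mean value plus a prior-weighted bonus."""
    total_visits = sum(visits)
    best_index, best_score = 0, -math.inf
    for a in range(len(prior)):
        bonus = c_puct * prior[a] * math.sqrt(total_visits + 1) / (1 + visits[a])
        score = q_mean[a] + bonus
        if score > best_score:
            best_index, best_score = a, score
    return best_index

# Hypothetical macro actions, each standing for several primitive steps.
MACRO_ACTIONS = ("keep_lane", "change_left", "brake")

# Hypothetical Q-values that a trained DQN might output for the root state.
dqn_q_values = [1.0, 0.2, -0.5]
prior = softmax(dqn_q_values)

# At a fresh node nothing has been visited, so the prior decides.
first = select_action(q_mean=[0.0, 0.0, 0.0], visits=[0, 0, 0], prior=prior)

# After many visits with poor returns, search moves on despite the prior.
later = select_action(q_mean=[-1.0, 0.0, 0.0], visits=[50, 0, 0], prior=prior)

print(MACRO_ACTIONS[first], MACRO_ACTIONS[later])
```

The prior focuses early simulations on actions the value network already rates highly, while the visit-count term lets the search overrule the prior once simulated returns disagree with it.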

Critical Analysis

Some limitations are worth noting. The evaluation is carried out in simulation only, so how well the learned belief model generalizes to real traffic, or to situations unlike those in its training data, remains open. Additionally, the combined cost of running the prediction model and the tree search could be a concern for real-time deployment.

It would also be interesting to see how the proposed approach compares to alternative methods for integrating learning-based components into motion planning, such as the "Quad-Query" approach. The authors could further explore the tradeoffs between solution quality, planning efficiency, and model complexity in their future work.

Conclusion

This paper presents a promising approach for efficient POMDP planning in autonomous driving scenarios, where the key innovation is a learned prediction model that updates its belief about other agents' intentions online. By planning with these predictions, the autonomous vehicle can make more informed decisions and improve both planning efficiency and decision quality.

The authors have demonstrated the effectiveness of their approach in simulation experiments, and the work has the potential to contribute to the ongoing development of safe and reliable autonomous driving systems. While there are some limitations that warrant further research, the core idea of using learned models to enhance POMDP planning is a valuable contribution to the field.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Motion Planning under Uncertainty: Integrating Learning-Based Multi-Modal Predictors into Branch Model Predictive Control

Mohamed-Khalil Bouzidi, Bojan Derajic, Daniel Goehring, Joerg Reichardt

In complex traffic environments, autonomous vehicles face multi-modal uncertainty about other agents' future behavior. To address this, recent advancements in learning-based motion predictors output multi-modal predictions. We present our novel framework that leverages Branch Model Predictive Control (BMPC) to account for these predictions. The framework includes an online scenario-selection process guided by topology and collision risk criteria. This efficiently selects a minimal set of predictions, rendering the BMPC real-time capable. Additionally, we introduce an adaptive decision postponing strategy that delays the planner's commitment to a single scenario until the uncertainty is resolved. Our comprehensive evaluations in traffic intersection and random highway merging scenarios demonstrate enhanced comfort and safety through our method.

5/7/2024

cs.RO cs.SY eess.SY

Planning with Adaptive World Models for Autonomous Driving

Arun Balajee Vasudevan, Neehar Peri, Jeff Schneider, Deva Ramanan

Motion planning is crucial for safe navigation in complex urban environments. Historically, motion planners (MPs) have been evaluated with procedurally-generated simulators like CARLA. However, such synthetic benchmarks do not capture real-world multi-agent interactions. nuPlan, a recently released MP benchmark, addresses this limitation by augmenting real-world driving logs with closed-loop simulation logic, effectively turning the fixed dataset into a reactive simulator. We analyze the characteristics of nuPlan's recorded logs and find that each city has its own unique driving behaviors, suggesting that robust planners must adapt to different environments. We learn to model such unique behaviors with BehaviorNet, a graph convolutional neural network (GCNN) that predicts reactive agent behaviors using features derived from recently-observed agent histories; intuitively, some aggressive agents may tailgate lead vehicles, while others may not. To model such phenomena, BehaviorNet predicts parameters of an agent's motion controller rather than predicting its spacetime trajectory (as most forecasters do). Finally, we present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions. Our extensive experiments demonstrate that AdaptiveDriver achieves state-of-the-art results on the nuPlan closed-loop planning benchmark, reducing test error from 6.4% to 4.6%, even when applied to never-before-seen cities.

6/18/2024

cs.RO cs.LG

RACP: Risk-Aware Contingency Planning with Multi-Modal Predictions

Khaled A. Mustafa, Daniel Jarne Ornia, Jens Kober, Javier Alonso-Mora

For an autonomous vehicle to operate reliably within real-world traffic scenarios, it is imperative to assess the repercussions of its prospective actions by anticipating the uncertain intentions exhibited by other participants in the traffic environment. Driven by the pronounced multi-modal nature of human driving behavior, this paper presents an approach that leverages Bayesian beliefs over the distribution of potential policies of other road users to construct a novel risk-aware probabilistic motion planning framework. In particular, we propose a novel contingency planner that outputs long-term contingent plans conditioned on multiple possible intents for other actors in the traffic scene. The Bayesian belief is incorporated into the optimization cost function to influence the behavior of the short-term plan based on the likelihood of other agents' policies. Furthermore, a probabilistic risk metric is employed to fine-tune the balance between efficiency and robustness. Through a series of closed-loop safety-critical simulated traffic scenarios shared with human-driven vehicles, we demonstrate the practical efficacy of our proposed approach that can handle multi-vehicle scenarios.

6/21/2024

cs.RO

A Cognitive-Driven Trajectory Prediction Model for Autonomous Driving in Mixed Autonomy Environment

Haicheng Liao, Zhenning Li, Chengyue Wang, Bonan Wang, Hanlin Kong, Yanchen Guan, Guofa Li, Zhiyong Cui, Chengzhong Xu

As autonomous driving technology progresses, the need for precise trajectory prediction models becomes paramount. This paper introduces an innovative model that infuses cognitive insights into trajectory prediction, focusing on perceived safety and dynamic decision-making. Distinct from traditional approaches, our model excels in analyzing interactions and behavior patterns in mixed autonomy traffic scenarios. It represents a significant leap forward, achieving marked performance improvements on several key datasets. Specifically, it surpasses existing benchmarks with gains of 16.2% on the Next Generation Simulation (NGSIM), 27.4% on the Highway Drone (HighD), and 19.8% on the Macao Connected Autonomous Driving (MoCAD) dataset. Our proposed model shows exceptional proficiency in handling corner cases, essential for real-world applications. Moreover, its robustness is evident in scenarios with missing or limited data, outperforming most of the state-of-the-art baselines. This adaptability and resilience position our model as a viable tool for real-world autonomous driving systems, heralding a new standard in vehicle trajectory prediction for enhanced safety and efficiency.

4/29/2024

cs.RO