Reinforcement Learning with Latent State Inference for Autonomous On-ramp Merging under Observation Delay

arXiv:2403.11852

Published 6/24/2024 by Amin Tabrizian, Zhitong Huang, Peng Wei

Abstract

This paper presents a novel approach to address the challenging problem of autonomous on-ramp merging, where a self-driving vehicle needs to seamlessly integrate into a flow of vehicles on a multi-lane highway. We introduce the Lane-keeping, Lane-changing with Latent-state Inference and Safety Controller (L3IS) agent, designed to perform the on-ramp merging task safely without comprehensive knowledge about surrounding vehicles' intents or driving styles. We also present an augmentation of this agent called AL3IS that accounts for observation delays, allowing the agent to make more robust decisions in real-world environments with vehicle-to-vehicle (V2V) communication delays. By modeling the unobservable aspects of the environment through latent states, such as other drivers' intents, our approach enhances the agent's ability to adapt to dynamic traffic conditions, optimize merging maneuvers, and ensure safe interactions with other vehicles. We demonstrate the effectiveness of our method through extensive simulations generated from real traffic data and compare its performance with existing approaches. L3IS shows a 99.90% success rate in a challenging on-ramp merging case generated from the real US Highway 101 data. We further perform a sensitivity analysis on AL3IS to evaluate its robustness against varying observation delays, which demonstrates an acceptable performance of 93.84% success rate in 1-second V2V communication delay.

Overview

This paper presents a reinforcement learning (RL) approach to enable autonomous on-ramp merging under observation delay.
The key idea is to infer the latent state of the environment, which includes the intentions and behaviors of other vehicles, to guide the decision-making of the autonomous vehicle.
The proposed agent, called Lane-keeping, Lane-changing with Latent-state Inference and Safety Controller (L3IS), is designed to merge safely without full knowledge of surrounding drivers' intents or driving styles. An augmented variant, AL3IS, handles delayed observations, such as those caused by vehicle-to-vehicle (V2V) communication delays.

Plain English Explanation

The paper focuses on the challenge of enabling autonomous vehicles to merge safely onto a highway from an on-ramp, even when there is a delay in the vehicle's ability to observe the surrounding environment. This is a common problem that autonomous vehicles must be able to handle, as the ability to merge seamlessly with other traffic is crucial for safe and efficient operation.

The researchers propose the Lane-keeping, Lane-changing with Latent-state Inference and Safety Controller (L3IS) agent, which uses reinforcement learning to train the autonomous vehicle to make merging decisions. The key innovation is that the system also infers the "latent state" of the environment, which includes the intentions and behaviors of the other vehicles on the road. This helps the autonomous vehicle anticipate the actions of other drivers and make better merging decisions. An augmented version, AL3IS, additionally accounts for the case where the vehicle's own observations of the environment arrive late.

By accounting for the latent state of the environment, L3IS and AL3IS aim to let the autonomous vehicle make more informed and safer merging decisions, even in real-world situations where its perception of its surroundings is delayed. This research builds on previous work on using latent state estimation to improve the decision-making of autonomous agents, as well as research on modeling lane change reactions and enhancing end-to-end autonomous driving using latent state information.

Technical Explanation

The L3IS agent combines reinforcement learning with a latent state inference module and a safety controller to perform on-ramp merging, and its AL3IS variant extends this to observation delays. The key components of the system are:

Environment Model: The researchers build a simulation environment from real traffic data (US Highway 101) that models the dynamics of an on-ramp merging scenario, including the behavior of other vehicles on the highway and the on-ramp.
Latent State Inference: The system includes a module that infers the latent state of the environment, which includes the intentions and behaviors of other vehicles. This is done by training a neural network to predict the future actions of nearby vehicles based on their current state and the autonomous vehicle's own actions.
Reinforcement Learning: The autonomous vehicle's merging policy is learned using reinforcement learning, where the agent is rewarded for safely and efficiently merging onto the highway. The latent state information is used to guide the agent's decision-making process.
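As a rough illustration of the latent state inference idea, the sketch below keeps a belief over one hidden driver intent and updates it from observed accelerations. It is not the paper's method: the description above mentions a neural network, whereas this sketch swaps in a hand-written discrete Bayes filter because it fits in a few lines. The intent labels, the Gaussian acceleration model, and all numeric values are assumptions made purely for illustration.

```python
import math

# Hypothetical intent classes and their assumed mean accelerations (m/s^2).
# These values are illustrative and are not taken from the paper.
INTENT_MEAN_ACCEL = {"yield": -1.0, "not_yield": 0.5}
ACCEL_STD = 0.8  # assumed observation noise (m/s^2)


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def update_belief(belief, observed_accel):
    """One Bayes step: posterior is proportional to likelihood times prior."""
    unnormalized = {
        intent: belief[intent] * gaussian_pdf(observed_accel, mean, ACCEL_STD)
        for intent, mean in INTENT_MEAN_ACCEL.items()
    }
    total = sum(unnormalized.values())
    return {intent: p / total for intent, p in unnormalized.items()}


belief = {"yield": 0.5, "not_yield": 0.5}  # uniform prior over intents
for accel in [-0.9, -1.2, -0.7]:  # a highway vehicle that keeps braking
    belief = update_belief(belief, accel)
```

After three braking observations the belief concentrates on the "yield" intent. In a setup like the one described, such a belief (or a learned equivalent) could be appended to the agent's observation so that the merging policy conditions on the inferred intent.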

The researchers evaluate the approach in the simulated environment and compare it with existing approaches. According to the abstract, L3IS reaches a 99.90% success rate in a challenging on-ramp merging case generated from real US Highway 101 data. A sensitivity analysis of AL3IS across varying observation delays reports a 93.84% success rate under a 1-second V2V communication delay.
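One common way to cope with delayed observations in reinforcement learning is to pair the most recent available observation with the actions taken since it was recorded. The sketch below shows only that bookkeeping. Whether AL3IS uses exactly this construction is not stated in this summary, and the class name `DelayedObservationBuffer` and its interface are hypothetical.

```python
from collections import deque


class DelayedObservationBuffer:
    """Serves observations that are `delay_steps` old, plus the actions taken since."""

    def __init__(self, delay_steps, initial_obs, noop_action=0.0):
        self.delay_steps = delay_steps
        # Pre-fill so that early reads return the initial observation.
        self.obs_queue = deque(
            [initial_obs] * (delay_steps + 1), maxlen=delay_steps + 1
        )
        # Actions issued after the delayed observation was recorded.
        self.action_queue = deque([noop_action] * delay_steps, maxlen=delay_steps)

    def step(self, true_obs, action):
        """Record the newest true observation and the action that led to it."""
        self.obs_queue.append(true_obs)
        self.action_queue.append(action)

    def augmented_state(self):
        """Return (delayed observation, actions issued since that observation)."""
        return self.obs_queue[0], tuple(self.action_queue)


# Usage: with a 2-step delay, the agent at step 3 still sees observation "o1".
buf = DelayedObservationBuffer(delay_steps=2, initial_obs="o0")
buf.step("o1", action=0.1)
buf.step("o2", action=0.2)
buf.step("o3", action=0.3)
delayed_obs, recent_actions = buf.augmented_state()
```

The number of steps depends on the simulation time step, which this summary does not give. With a hypothetical 0.1 s step, a 1-second delay would correspond to `delay_steps=10`.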

Critical Analysis

The paper presents a promising approach to address the challenge of autonomous on-ramp merging under observation delay, which is a crucial capability for real-world autonomous driving. The use of latent state inference to guide the reinforcement learning agent's decision-making is a novel and well-motivated idea, building on previous research in this area.

However, the paper does not provide a thorough discussion of the limitations and potential issues with the proposed approach. For example, the reliance on a simulated environment may raise questions about the transferability of the approach to real-world scenarios, where the dynamics and behaviors of other vehicles may be more complex and unpredictable. Additionally, the paper does not explore the potential impact of errors in the latent state estimation on the overall system performance, which could be an important consideration.

Further research could also investigate how L3IS and AL3IS scale to more complex driving environments, as well as their robustness to different types of observation delays and sensor failures. Exploring the integration of L3IS with other decision-making frameworks, such as those used in the HighwayLLM project, could also be a fruitful avenue for future work.

Conclusion

The L3IS agent and its delay-aware variant AL3IS presented in this paper represent a promising step toward enabling autonomous vehicles to merge onto highways safely and efficiently, even in the presence of observation delays. By using latent state inference to guide the reinforcement learning agent's decision-making, the system aims to anticipate the behavior of other vehicles and make more informed merging decisions.

While the paper demonstrates the potential benefits of this approach in a simulated environment, further research is needed to address the limitations and explore the integration of L3IS with other decision-making frameworks for autonomous driving. Nevertheless, this work contributes to the ongoing efforts to develop robust and reliable autonomous driving systems that can navigate complex real-world scenarios.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation

Ke Guo, Zhenwei Miao, Wei Jing, Weiwei Liu, Weizi Li, Dayang Hao, Jia Pan

Microscopic traffic simulation plays a crucial role in transportation engineering by providing insights into individual vehicle behavior and overall traffic flow. However, creating a realistic simulator that accurately replicates human driving behaviors in various traffic conditions presents significant challenges. Traditional simulators relying on heuristic models often fail to deliver accurate simulations due to the complexity of real-world traffic environments. Due to the covariate shift issue, existing imitation learning-based simulators often fail to generate stable long-term simulations. In this paper, we propose a novel approach called learner-aware supervised imitation learning to address the covariate shift problem in multi-agent imitation learning. By leveraging a variational autoencoder simultaneously modeling the expert and learner state distribution, our approach augments expert states such that the augmented state is aware of learner state distribution. Our method, applied to urban traffic simulation, demonstrates significant improvements over existing state-of-the-art baselines in both short-term microscopic and long-term macroscopic realism when evaluated on the real-world dataset pNEUMA.

5/24/2024

cs.AI cs.LG

Modeling the Lane-Change Reactions to Merging Vehicles for Highway On-Ramp Simulations

Dustin Holley, Jovin Dsa, Hossein Nourkhiz Mahjoub, Gibran Ali, Tyler Naes, Ehsan Moradi-Pari, Pawan Sai Kallepalli

Enhancing simulation environments to replicate real-world driver behavior is essential for developing Autonomous Vehicle technology. While some previous works have studied the yielding reaction of lag vehicles in response to a merging car at highway on-ramps, the possible lane-change reaction of the lag car has not been widely studied. In this work we aim to improve the simulation of the highway merge scenario by including the lane-change reaction in addition to yielding behavior of main-lane lag vehicles, and we evaluate two different models for their ability to capture this reactive lane-change behavior. To tune the payoff functions of these models, a novel naturalistic dataset was collected on U.S. highways that provided several hours of merge-specific data to learn the lane change behavior of U.S. drivers. To make sure that we are collecting a representative set of different U.S. highway geometries in our data, we surveyed 50,000 U.S. highway on-ramps and then selected eight representative sites. The data were collected using roadside-mounted lidar sensors to capture various merge driver interactions. The models were demonstrated to be configurable for both keep-straight and lane-change behavior. The models were finally integrated into a high-fidelity simulation environment and confirmed to have adequate computation time efficiency for use in large-scale simulations to support autonomous vehicle development.

4/16/2024

cs.RO cs.SY eess.SY

Latent State Estimation Helps UI Agents to Reason

William E Bishop, Alice Li, Christopher Rawles, Oriana Riva

A common problem for agents operating in real-world environments is that the response of an environment to their actions may be non-deterministic and observed through noise. This renders environmental state and progress towards completing a task latent. Despite recent impressive demonstrations of LLM's reasoning abilities on various benchmarks, whether LLMs can build estimates of latent state and leverage them for reasoning has not been explicitly studied. We investigate this problem in the real-world domain of autonomous UI agents. We establish that appropriately prompting LLMs in a zero-shot manner can be formally understood as forming point estimates of latent state in a textual space. In the context of autonomous UI agents we then show that LLMs used in this manner are more than 76% accurate at inferring various aspects of latent state, such as performed (vs. commanded) actions and task progression. Using both public and internal benchmarks and three reasoning methods (zero-shot, CoT-SC & ReAct), we show that LLM-powered agents that explicitly estimate and reason about latent state are able to successfully complete up to 1.6x more tasks than those that do not.

5/21/2024

cs.AI cs.LG

Enhancing End-to-End Autonomous Driving with Latent World Model

Yingyan Li, Lue Fan, Jiawei He, Yuqi Wang, Yuntao Chen, Zhaoxiang Zhang, Tieniu Tan

End-to-end autonomous driving has garnered widespread attention. Current end-to-end approaches largely rely on the supervision from perception tasks such as detection, tracking, and map segmentation to aid in learning scene representations. However, these methods require extensive annotations, hindering the data scalability. To address this challenge, we propose a novel self-supervised method to enhance end-to-end driving without the need for costly labels. Specifically, our framework LAW uses a LAtent World model to predict future latent features based on the predicted ego actions and the latent feature of the current frame. The predicted latent features are supervised by the actually observed features in the future. This supervision jointly optimizes the latent feature learning and action prediction, which greatly enhances the driving performance. As a result, our approach achieves state-of-the-art performance in both open-loop and closed-loop benchmarks without costly annotations.

6/13/2024

cs.CV