Stranger Danger! Identifying and Avoiding Unpredictable Pedestrians in RL-based Social Robot Navigation

Read original: arXiv:2407.06056 - Published 7/9/2024 by Sara Pohland, Alvin Tan, Prabal Dutta, Claire Tomlin

Overview

This paper investigates how to identify and avoid unpredictable pedestrians when using reinforcement learning (RL) for social robot navigation.
The researchers modify the Socially Attentive Reinforcement Learning (SARL) policy so that the robot estimates how unpredictable each pedestrian is and keeps additional caution around "stranger danger" pedestrians, whose future movements are hard to predict.
They evaluate the method in simulation and on a physical robot, reporting 82% fewer collisions than the original SARL policy at similar navigation times and path lengths.

Plain English Explanation

The goal of this research is to help social robots navigate safely around pedestrians, even ones whose future movements are hard to predict. When a robot is moving through a crowded area, it's important that it can identify people who might suddenly change direction or speed, and then adjust its own path to avoid collisions.

The researchers developed a new system that allows the robot to detect these "unpredictable" pedestrians. By analyzing sensor data about the people around it, the robot can estimate how likely each person is to make unexpected movements. It then uses this information to plan a navigation path that steers clear of the riskier pedestrians.

In simulation, the modified robot collided with pedestrians far less often than the original policy and spent less time in their personal space, while taking about the same time and distance to reach its goal. The researchers also showed that some of the key behaviors carry over to a physical robot.

This work is important for making social robots more reliable and trustworthy as they interact with humans in public settings. Being able to identify and avoid unpredictable pedestrians is a key capability for robots to safely coexist with people in the real world.

Technical Explanation

The researchers modify the learning process of an existing RL navigation policy, Socially Attentive Reinforcement Learning (SARL), so that it maintains additional caution in unfamiliar situations. First, they change the training process to systematically introduce deviations into the pedestrian model, so the policy encounters pedestrians who do not behave as expected. Second, they update the value network to estimate and use pedestrian-unpredictability features, which lets the robot identify which individuals are more likely to make unexpected movements.
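One simple way to quantify how unpredictable a pedestrian is can be sketched as follows. This is an illustration only: the function name `unpredictability_score` and the constant-velocity baseline are assumptions, not the features defined in the paper. The score is the average distance between where a pedestrian actually went and where a constant-velocity extrapolation of their previous two positions said they would go.

```python
import math

def unpredictability_score(positions):
    """Hypothetical feature: mean error of a constant-velocity prediction.

    positions is a list of (x, y) observations taken at equal time intervals.
    """
    if len(positions) < 3:
        return 0.0  # not enough history to compare a prediction against
    errors = []
    for k in range(2, len(positions)):
        (x0, y0), (x1, y1), (x2, y2) = positions[k - 2], positions[k - 1], positions[k]
        # Constant-velocity prediction: repeat the most recent displacement.
        pred_x, pred_y = 2 * x1 - x0, 2 * y1 - y0
        errors.append(math.hypot(x2 - pred_x, y2 - pred_y))
    return sum(errors) / len(errors)

# A pedestrian walking in a straight line versus one who keeps turning.
steady = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
erratic = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
print(unpredictability_score(steady))
print(unpredictability_score(erratic))
```

The steady walker matches the extrapolation at every step, so their score is zero, while the erratic walker gets a positive score. A learned policy could take a score like this as an extra input for each pedestrian.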

Third, they implement a reward function that teaches the policy an effective response to pedestrian unpredictability, so the robot prioritizes avoiding the "stranger danger" pedestrians. The authors also describe how to apply these modifications to other RL policies.
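To illustrate how extra caution around unpredictable pedestrians could be encouraged during learning, the sketch below shows a per-step reward whose comfort distance grows with a pedestrian's unpredictability score. Everything here is assumed for illustration: the function name `step_reward`, the linear scaling, and all constants are hypothetical and are not taken from the paper.

```python
def step_reward(gap, unpredictability, reached_goal=False,
                base_comfort=0.2, gain=0.5,
                collision_penalty=-0.25, success_reward=1.0,
                discomfort_weight=0.5):
    """Hypothetical per-step reward for a crowd-navigation policy.

    gap is the surface-to-surface distance (in meters) to the closest
    pedestrian; unpredictability is a non-negative score for that pedestrian.
    """
    if gap <= 0.0:
        return collision_penalty  # robot and pedestrian overlap
    if reached_goal:
        return success_reward
    # Assumed design: less predictable pedestrians get a wider comfort zone.
    comfort = base_comfort + gain * unpredictability
    if gap < comfort:
        # Penalty grows linearly the deeper the robot intrudes into the zone.
        return (gap - comfort) * discomfort_weight
    return 0.0

# The same 0.3 m gap is acceptable next to a predictable pedestrian
# but penalized next to an unpredictable one.
print(step_reward(0.3, unpredictability=0.0))
print(step_reward(0.3, unpredictability=0.4))
```

With a reward shaped like this, the policy has no incentive to slow down or take longer paths around predictable pedestrians, which fits the reported outcome of similar navigation times alongside fewer collisions.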

In simulation, the modified policy maintains navigation times and path lengths similar to the original SARL policy, while reducing the number of collisions by 82% and reducing the proportion of time spent in pedestrians' personal space by up to 19 percentage points in the most difficult cases. The researchers also demonstrate that some key high-level behaviors of the approach transfer to a physical robot.
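The paper's abstract reports two different kinds of improvement: a relative reduction (82% fewer collisions) and an absolute change in a proportion (up to 19 percentage points less time in pedestrians' personal space). The sketch below shows how each is computed. The input numbers are invented purely to illustrate the arithmetic and are not results from the paper.

```python
def relative_reduction(before, after):
    """Fraction by which a count dropped, relative to its starting value."""
    return (before - after) / before

def percentage_point_change(frac_before, frac_after):
    """Absolute drop between two proportions, expressed in percentage points."""
    return (frac_before - frac_after) * 100.0

# Made-up example inputs (NOT from the paper): 50 collisions falling to 9,
# and time in personal space falling from 30% to 11% of the episode.
print(f"{relative_reduction(50, 9):.0%} fewer collisions")
print(f"{percentage_point_change(0.30, 0.11):.0f} percentage points less time")
# The same 30% -> 11% drop expressed as a relative reduction is much larger.
print(f"{relative_reduction(0.30, 0.11):.0%} relative reduction")
```

Keeping the two measures apart matters when comparing results across papers, because a drop of 19 percentage points can correspond to a much larger relative reduction when the starting proportion is small.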

Critical Analysis

The researchers acknowledge some limitations of their work. Their experiments were conducted in relatively controlled environments, and further testing is needed to evaluate performance in more complex, real-world settings with higher pedestrian density and unpredictability.

Additionally, the proposed methods rely on accurate sensing and prediction of pedestrian trajectories, which can be challenging in practice due to sensor noise, occlusions, and the inherent uncertainty in human behavior. The researchers note that incorporating more robust perception and forecasting techniques could further improve the system's reliability.

Another potential issue is the interpretability of the robot's decision-making process. Because the policy relies on a learned value network, it may be difficult for humans to understand why the robot chooses a particular path, and therefore to trust it. Approaches that offer more transparency could be valuable for deploying these systems in public spaces.

Overall, this work represents an important step towards safer and more efficient social robot navigation. By explicitly considering the risk posed by unpredictable pedestrians, the researchers have developed a practical solution that could significantly enhance the real-world performance of autonomous systems interacting with humans.

Conclusion

This paper presents modifications to RL-based social robot navigation for identifying and avoiding unpredictable pedestrians. By estimating how unpredictable each person is, the robot can detect "stranger danger" individuals and plan paths that safely navigate around them.

The researchers' simulations show fewer collisions and less time spent in pedestrians' personal space than the original SARL policy, at similar navigation times and path lengths, and their physical-robot demonstration suggests that key behaviors transfer to hardware. This work represents an important advancement in making social robots more reliable and trustworthy as they interact with humans in public settings.

The researchers acknowledge some limitations and areas for future work, such as improving the system's performance in more complex environments and enhancing the interpretability of the robot's decision-making. Nonetheless, this research represents a significant step forward in addressing a critical challenge for the widespread deployment of social robots in the real world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Stranger Danger! Identifying and Avoiding Unpredictable Pedestrians in RL-based Social Robot Navigation

Sara Pohland, Alvin Tan, Prabal Dutta, Claire Tomlin

Reinforcement learning (RL) methods for social robot navigation show great success navigating robots through large crowds of people, but the performance of these learning-based methods tends to degrade in particularly challenging or unfamiliar situations due to the models' dependency on representative training data. To ensure human safety and comfort, it is critical that these algorithms handle uncommon cases appropriately, but the low frequency and wide diversity of such situations present a significant challenge for these data-driven methods. To overcome this challenge, we propose modifications to the learning process that encourage these RL policies to maintain additional caution in unfamiliar situations. Specifically, we improve the Socially Attentive Reinforcement Learning (SARL) policy by (1) modifying the training process to systematically introduce deviations into a pedestrian model, (2) updating the value network to estimate and utilize pedestrian-unpredictability features, and (3) implementing a reward function to learn an effective response to pedestrian unpredictability. Compared to the original SARL policy, our modified policy maintains similar navigation times and path lengths, while reducing the number of collisions by 82% and reducing the proportion of time spent in the pedestrians' personal space by up to 19 percentage points for the most difficult cases. We also describe how to apply these modifications to other RL policies and demonstrate that some key high-level behaviors of our approach transfer to a physical robot.

7/9/2024

Uncertainty-Aware DRL for Autonomous Vehicle Crowd Navigation in Shared Space

Mahsa Golchoubian, Moojan Ghafurian, Kerstin Dautenhahn, Nasser Lashgarian Azad

Safe, socially compliant, and efficient navigation of low-speed autonomous vehicles (AVs) in pedestrian-rich environments necessitates considering pedestrians' future positions and interactions with the vehicle and others. Despite the inevitable uncertainties associated with pedestrians' predicted trajectories due to their unobserved states (e.g., intent), existing deep reinforcement learning (DRL) algorithms for crowd navigation often neglect these uncertainties when using predicted trajectories to guide policy learning. This omission limits the usability of predictions when diverging from ground truth. This work introduces an integrated prediction and planning approach that incorporates the uncertainties of predicted pedestrian states in the training of a model-free DRL algorithm. A novel reward function encourages the AV to respect pedestrians' personal space, decrease speed during close approaches, and minimize the collision probability with their predicted paths. Unlike previous DRL methods, our model, designed for AV operation in crowded spaces, is trained in a novel simulation environment that reflects realistic pedestrian behaviour in a shared space with vehicles. Results show a 40% decrease in collision rate and a 15% increase in minimum distance to pedestrians compared to the state of the art model that does not account for prediction uncertainty. Additionally, the approach outperforms model predictive control methods that incorporate the same prediction uncertainties in terms of both performance and computational time, while producing trajectories closer to human drivers in similar scenarios.

5/24/2024

Disentangling Uncertainty for Safe Social Navigation using Deep Reinforcement Learning

Daniel Flögel, Marcos Gómez Villafañe, Joshua Ransiek, Sören Hohmann

Autonomous mobile robots are increasingly employed in pedestrian-rich environments where safe navigation and appropriate human interaction are crucial. While Deep Reinforcement Learning (DRL) enables socially integrated robot behavior, challenges persist in novel or perturbed scenarios to indicate when and why the policy is uncertain. Unknown uncertainty in decision-making can lead to collisions or human discomfort and is one reason why safe and risk-aware navigation is still an open problem. This work introduces a novel approach that integrates aleatoric, epistemic, and predictive uncertainty estimation into a DRL-based navigation framework for uncertainty estimates in decision-making. We, therefore, incorporate Observation-Dependent Variance (ODV) and dropout into the Proximal Policy Optimization (PPO) algorithm. For different types of perturbations, we compare the ability of Deep Ensembles and Monte-Carlo Dropout (MC-Dropout) to estimate the uncertainties of the policy. In uncertain decision-making situations, we propose to change the robot's social behavior to conservative collision avoidance. The results show that the ODV-PPO algorithm converges faster with better generalization and disentangles the aleatoric and epistemic uncertainties. In addition, the MC-Dropout approach is more sensitive to perturbations and capable to correlate the uncertainty type to the perturbation type better. With the proposed safe action selection scheme, the robot can navigate in perturbed environments with fewer collisions.

9/18/2024

Multi-Robot Cooperative Socially-Aware Navigation Using Multi-Agent Reinforcement Learning

Weizheng Wang, Le Mao, Ruiqi Wang, Byung-Cheol Min

In public spaces shared with humans, ensuring multi-robot systems navigate without collisions while respecting social norms is challenging, particularly with limited communication. Although current robot social navigation techniques leverage advances in reinforcement learning and deep learning, they frequently overlook robot dynamics in simulations, leading to a simulation-to-reality gap. In this paper, we bridge this gap by presenting a new multi-robot social navigation environment crafted using Dec-POSMDP and multi-agent reinforcement learning. Furthermore, we introduce SAMARL: a novel benchmark for cooperative multi-robot social navigation. SAMARL employs a unique spatial-temporal transformer combined with multi-agent reinforcement learning. This approach effectively captures the complex interactions between robots and humans, thus promoting cooperative tendencies in multi-robot systems. Our extensive experiments reveal that SAMARL outperforms existing baseline and ablation models in our designed environment. Demo videos for this work can be found at: https://sites.google.com/view/samarl

5/17/2024