UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving

Read original: arXiv:2406.06370 - Published 6/11/2024 by Daniel Bogdoll, Noel Ollick, Tim Joseph, J. Marius Zollner

Overview

This paper introduces UMAD, an unsupervised method for detecting anomalies in autonomous driving scenarios at the mask level.
UMAD uses an autoencoder to learn a representation of normal driving behavior, then identifies anomalies as regions in the scene that deviate from this learned model.
The authors evaluate UMAD on a large dataset of real-world driving footage and demonstrate its effectiveness at detecting a variety of anomalous events like accidents, jaywalking pedestrians, and unexpected object appearances.

Plain English Explanation

The paper introduces a new way to automatically detect unusual or problematic situations in self-driving car footage. The key idea is to train a neural network, called an autoencoder, to learn what "normal" driving looks like. Then, when presented with new footage, the system identifies the parts of the scene that do not match this learned model of normality and flags them as potential anomalies.

The advantage of this approach is that it doesn't require labeling every possible type of anomaly ahead of time. Instead, the system learns to recognize anomalies in a more general, unsupervised way. This makes it more flexible and applicable to a wider range of unexpected driving scenarios, like accidents, pedestrians walking in unexpected places, or objects suddenly appearing in the road.

The authors test their UMAD system on a large dataset of real-world driving videos and show that it can effectively detect a variety of anomalous events. This suggests it could be a useful tool for improving the safety and reliability of autonomous vehicles, by helping them quickly identify and respond to unusual or dangerous situations on the road.

Technical Explanation

The core of the UMAD system is an autoencoder neural network. An autoencoder is a type of model that learns to compress input data (in this case, video frames from normal driving) into a compact, low-dimensional representation, and then tries to reconstruct the original input from that compressed encoding.
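
The compress-then-reconstruct idea can be illustrated with a toy example. The sketch below is not the model from the paper: it assumes 2-D points instead of video frames and uses the closed-form optimum of a linear autoencoder with a one-number bottleneck (which coincides with PCA). All function names are hypothetical. Points that follow the pattern seen during fitting are reconstructed almost exactly, while a point off that pattern is not.

```python
import math

def fit_linear_autoencoder(points):
    """Fit a linear autoencoder with a 1-D bottleneck to 2-D points.

    Uses the closed-form solution (principal axis of the data) instead of
    gradient descent. This is an illustrative stand-in, not the paper's model.
    """
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mean_x) ** 2 for p in points) / n
    cyy = sum((p[1] - mean_y) ** 2 for p in points) / n
    cxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Orientation of the principal axis of the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * cxy, cxx - cyy)
    return (mean_x, mean_y), (math.cos(theta), math.sin(theta))

def encode(point, mean, axis):
    """Compress a 2-D point into a single number (the bottleneck code)."""
    return (point[0] - mean[0]) * axis[0] + (point[1] - mean[1]) * axis[1]

def decode(code, mean, axis):
    """Reconstruct a 2-D point from its 1-D code."""
    return (mean[0] + code * axis[0], mean[1] + code * axis[1])

def reconstruction_error(point, mean, axis):
    """Euclidean distance between a point and its reconstruction."""
    rx, ry = decode(encode(point, mean, axis), mean, axis)
    return math.hypot(point[0] - rx, point[1] - ry)

# "Normal" data: points on the line y = 2x + 1.
normal_points = [(float(x), 2.0 * x + 1.0) for x in range(-5, 6)]
mean, axis = fit_linear_autoencoder(normal_points)

error_normal = reconstruction_error((3.0, 7.0), mean, axis)      # on the line
error_anomalous = reconstruction_error((3.0, -4.0), mean, axis)  # off the line
print(error_normal, error_anomalous)
```

The reconstruction error is what later serves as the anomaly signal: it stays near zero for inputs resembling the training data and grows for inputs the compressed representation cannot express.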

By training the autoencoder on a large corpus of "normal" driving footage, the authors hypothesize that it will learn to effectively model the typical appearance and dynamics of a driving scene. Then, when presented with new footage, any regions that deviate significantly from this learned model can be flagged as potential anomalies.
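
The summary does not say how "deviate significantly" is turned into a decision. One common option, assumed here purely for illustration, is to calibrate a threshold on scores measured for known-normal footage (mean plus `k` standard deviations) and flag any region whose score exceeds it. The function names, the value of `k`, and the example scores are all hypothetical.

```python
import statistics

def calibrate_threshold(normal_scores, k=3.0):
    """Threshold = mean + k * standard deviation of scores on normal data.

    This rule is an assumption for illustration; the summary does not
    specify how the decision boundary is chosen.
    """
    mean = statistics.fmean(normal_scores)
    std = statistics.pstdev(normal_scores)
    return mean + k * std

def flag_anomalies(region_scores, threshold):
    """Return the names of regions whose score exceeds the threshold."""
    return [name for name, score in region_scores.items() if score > threshold]

# Hypothetical reconstruction errors measured on normal driving footage.
normal_scores = [0.02, 0.03, 0.01, 0.04, 0.02, 0.03]
threshold = calibrate_threshold(normal_scores)

# Hypothetical per-region scores for a new frame.
new_frame_scores = {"road": 0.03, "car": 0.05, "debris": 0.60}
flagged = flag_anomalies(new_frame_scores, threshold)
print(threshold, flagged)
```

A larger `k` reduces false alarms at the cost of missing subtle anomalies, so in practice the value would be tuned on held-out data.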

To do this, UMAD first uses a pre-trained instance segmentation model to divide each video frame into a set of semantic masks (e.g. identifying the road, vehicles, pedestrians, etc.). It then passes these mask-level inputs through the autoencoder, which outputs a reconstruction of the original masks. The difference between the input masks and the reconstructed masks is used to compute an "anomaly score" for each region in the frame.
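
A minimal sketch of the per-mask scoring step described above, under several assumptions: frames are small grayscale grids, each mask is a list of pixel coordinates, and the score is the mean absolute difference between input and reconstruction inside the mask. The summary does not state the exact error measure, so this choice and the function name `mask_anomaly_scores` are hypothetical.

```python
def mask_anomaly_scores(frame, reconstruction, masks):
    """Mean absolute reconstruction error per mask.

    frame, reconstruction: 2-D lists of pixel intensities with equal shape.
    masks: dict mapping a mask name to a list of (row, col) coordinates.
    The error measure is an illustrative assumption.
    """
    scores = {}
    for name, pixels in masks.items():
        if not pixels:
            scores[name] = 0.0  # an empty mask carries no evidence
            continue
        total = sum(abs(frame[r][c] - reconstruction[r][c]) for r, c in pixels)
        scores[name] = total / len(pixels)
    return scores

# Toy 2x3 frame: the right-hand column contains an object the model
# fails to reconstruct.
frame = [
    [0.2, 0.2, 0.9],
    [0.2, 0.2, 0.9],
]
reconstruction = [
    [0.2, 0.2, 0.1],
    [0.2, 0.2, 0.1],
]
masks = {
    "road": [(0, 0), (0, 1), (1, 0), (1, 1)],
    "unknown_object": [(0, 2), (1, 2)],
}

scores = mask_anomaly_scores(frame, reconstruction, masks)
most_anomalous = max(scores, key=scores.get)
print(scores, most_anomalous)
```

Averaging within a mask yields one score per segmented region rather than per pixel, which is what makes the detection "mask-level".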

The authors evaluate UMAD on the BDD100K dataset, a large collection of real-world driving footage, and show that it detects a variety of anomalous events such as accidents, jaywalking pedestrians, and unexpected object appearances. They also compare UMAD to several baseline anomaly detection methods and report that it outperforms them.

Critical Analysis

One potential limitation of the UMAD approach is that it relies on the quality and accuracy of the pre-trained instance segmentation model. If this underlying model makes mistakes in identifying and delineating the various semantic regions in the scene, it could negatively impact UMAD's ability to accurately detect anomalies.

Additionally, while UMAD is designed to be unsupervised and generalize to a wide range of anomalous events, the authors note that its performance may still be influenced by the specific distribution of "normal" driving scenarios present in the training data. If the model is not exposed to a sufficiently diverse set of normal driving situations, it may struggle to accurately identify true anomalies.

It would also be interesting to see how UMAD performs in more complex driving environments, such as dense urban areas or poor weather conditions, where the definition of "normal" driving may be less clear-cut.

Overall, however, the UMAD approach represents an interesting and potentially valuable contribution to the field of anomaly detection for autonomous driving. Its ability to operate in an unsupervised manner and identify a wide range of anomalous events is an important step towards more robust and reliable self-driving systems.

Conclusion

The UMAD paper introduces a novel unsupervised method for detecting anomalies in autonomous driving scenarios at the mask level. By training an autoencoder on examples of "normal" driving, the system can identify regions in new footage that deviate significantly from this learned model, flagging them as potential anomalies.

The authors demonstrate the effectiveness of UMAD on a large dataset of real-world driving videos, showing its ability to detect a variety of anomalous events like accidents, jaywalking pedestrians, and unexpected object appearances. While the approach has some limitations, it represents an important contribution to the field of anomaly detection for self-driving cars, with the potential to improve the safety and reliability of these systems as they continue to develop.

This summary was produced with help from an AI and may contain inaccuracies. Check the links to read the original source documents.

Related Papers

UMAD: Unsupervised Mask-Level Anomaly Detection for Autonomous Driving

Daniel Bogdoll, Noel Ollick, Tim Joseph, J. Marius Zollner

Dealing with atypical traffic scenarios remains a challenging task in autonomous driving. However, most anomaly detection approaches cannot be trained on raw sensor data but require exposure to outlier data and powerful semantic segmentation models trained in a supervised fashion. This limits the representation of normality to labeled data, which does not scale well. In this work, we revisit unsupervised anomaly detection and present UMAD, leveraging generative world models and unsupervised image segmentation. Our method outperforms state-of-the-art unsupervised anomaly detection.

6/11/2024

Label-free Anomaly Detection in Aerial Agricultural Images with Masked Image Modeling

Sambal Shikhar, Anupam Sobti

Detecting various types of stresses (nutritional, water, nitrogen, etc.) in agricultural fields is critical for farmers to ensure maximum productivity. However, stresses show up in different shapes and sizes across different crop types and varieties. Hence, this is posed as an anomaly detection task in agricultural images. Accurate anomaly detection in agricultural UAV images is vital for early identification of field irregularities. Traditional supervised learning faces challenges in adapting to diverse anomalies, necessitating extensive annotated data. In this work, we overcome this limitation with self-supervised learning using a masked image modeling approach. Masked Autoencoders (MAE) extract meaningful normal features from unlabeled image samples which produces high reconstruction error for the abnormal pixels during reconstruction. To remove the need of using only "normal" data while training, we use an anomaly suppression loss mechanism that effectively minimizes the reconstruction of anomalous pixels and allows the model to learn anomalous areas without explicitly separating "normal" images for training. Evaluation on the Agriculture-Vision data challenge shows a mIOU score improvement in comparison to prior state of the art in unsupervised and self-supervised methods. A single model generalizes across all the anomaly categories in the Agri-Vision Challenge Dataset.

4/16/2024

UMAD: University of Macau Anomaly Detection Benchmark Dataset

Dong Li, Lineng Chen, Cheng-Zhong Xu, Hui Kong

Anomaly detection is critical in surveillance systems and patrol robots by identifying anomalous regions in images for early warning. Depending on whether reference data are utilized, anomaly detection can be categorized into anomaly detection with reference and anomaly detection without reference. Currently, anomaly detection without reference, which is closely related to out-of-distribution (OoD) object detection, struggles with learning anomalous patterns due to the difficulty of collecting sufficiently large and diverse anomaly datasets with the inherent rarity and novelty of anomalies. Alternatively, anomaly detection with reference employs the scheme of change detection to identify anomalies by comparing semantic changes between a reference image and a query one. However, there are very few ADr works due to the scarcity of public datasets in this domain. In this paper, we aim to address this gap by introducing the UMAD Benchmark Dataset. To our best knowledge, this is the first benchmark dataset designed specifically for anomaly detection with reference in robotic patrolling scenarios, e.g., where an autonomous robot is employed to detect anomalous objects by comparing a reference and a query video sequences. The reference sequences can be taken by the robot along a specified route when there are no anomalous objects in the scene. The query sequences are captured online by the robot when it is patrolling in the same scene following the same route. Our benchmark dataset is elaborated such that each query image can find a corresponding reference based on accurate robot localization along the same route in the prebuilt 3D map, with which the reference and query images can be geometrically aligned using adaptive warping. Besides the proposed benchmark dataset, we evaluate the baseline models of ADr on this dataset.

8/23/2024

Supervised Anomaly Detection for Complex Industrial Images

Aimira Baitieva, David Hurych, Victor Besnier, Olivier Bernard

Automating visual inspection in industrial production lines is essential for increasing product quality across various industries. Anomaly detection (AD) methods serve as robust tools for this purpose. However, existing public datasets primarily consist of images without anomalies, limiting the practical application of AD methods in production settings. To address this challenge, we present (1) the Valeo Anomaly Dataset (VAD), a novel real-world industrial dataset comprising 5000 images, including 2000 instances of challenging real defects across more than 20 subclasses. Acknowledging that traditional AD methods struggle with this dataset, we introduce (2) Segmentation-based Anomaly Detector (SegAD). First, SegAD leverages anomaly maps as well as segmentation maps to compute local statistics. Next, SegAD uses these statistics and an optional supervised classifier score as input features for a Boosted Random Forest (BRF) classifier, yielding the final anomaly score. Our SegAD achieves state-of-the-art performance on both VAD (+2.1% AUROC) and the VisA dataset (+0.4% AUROC). The code and the models are publicly available.

5/14/2024