Driver Attention Tracking and Analysis

2404.07122

Published 4/12/2024 by Dat Viet Thanh Nguyen, Anh Tran, Hoai Nam Vu, Cuong Pham, Minh Hoai

Abstract

We propose a novel method to estimate a driver's points-of-gaze using a pair of ordinary cameras mounted on the windshield and dashboard of a car. This is a challenging problem due to the dynamics of traffic environments with 3D scenes of unknown depths. This problem is further complicated by the volatile distance between the driver and the camera system. To tackle these challenges, we develop a novel convolutional network that simultaneously analyzes the image of the scene and the image of the driver's face. This network has a camera calibration module that can compute an embedding vector that represents the spatial configuration between the driver and the camera system. This calibration module improves the overall network's performance, which can be jointly trained end to end. We also address the lack of annotated data for training and evaluation by introducing a large-scale driving dataset with point-of-gaze annotations. This is an in situ dataset of real driving sessions in an urban city, containing synchronized images of the driving scene as well as the face and gaze of the driver. Experiments on this dataset show that the proposed method outperforms various baseline methods, having the mean prediction error of 29.69 pixels, which is relatively small compared to the $1280 \times 720$ resolution of the scene camera.
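To put the reported 29.69-pixel error in context, the sketch below computes a mean pixel error and expresses it as a fraction of the frame diagonal. It assumes the error is the mean Euclidean distance between predicted and annotated gaze points, which the abstract does not spell out; the function names and the toy coordinates are hypothetical.

```python
import math

def mean_pixel_error(predicted, actual):
    """Mean Euclidean distance, in pixels, between predicted and annotated points.

    Assumption: the error metric is a plain Euclidean distance averaged over frames.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(math.dist(p, a) for p, a in zip(predicted, actual))
    return total / len(predicted)

def error_as_fraction_of_diagonal(error_px, width=1280, height=720):
    """Express a pixel error relative to the diagonal of the scene-camera frame."""
    return error_px / math.hypot(width, height)

# Hypothetical toy data, not taken from the paper's dataset.
predicted = [(640.0, 360.0), (103.0, 204.0)]
actual = [(640.0, 360.0), (100.0, 200.0)]

print(mean_pixel_error(predicted, actual))             # distances are 0 and 5 pixels
print(round(error_as_fraction_of_diagonal(29.69), 4))  # reported error vs. 1280x720 diagonal
```

Under this assumption, 29.69 pixels is roughly 2% of the diagonal of a 1280 by 720 frame.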


Overview

This paper presents a method for estimating a driver's points-of-gaze from a pair of ordinary cameras, together with a dataset of point-of-gaze annotations for training and evaluating it.
The dataset was recorded in situ during real driving sessions in an urban city and contains synchronized images of the driving scene and of the driver's face and gaze.
Experiments on this dataset show that the method outperforms several baselines, with potential applications in areas like driver assistance systems and autonomous vehicle development.

Plain English Explanation

This research explores how drivers focus their attention while behind the wheel. The researchers collected data by tracking the eye movements of drivers in real driving situations. This allowed them to see exactly where the drivers were looking and for how long.

By analyzing this data, the researchers were able to identify common patterns in how drivers distribute their visual attention. For example, they may have noticed that drivers tend to focus more on the road ahead than on their side mirrors, or that they frequently glance at their speedometer.

Understanding these attention patterns could be very useful for developing advanced driver assistance systems or self-driving cars. These technologies could be designed to support the driver's natural attention behaviors, or to alert the driver if their attention seems to be wandering from the road. Overall, this research provides valuable insights into the complex cognitive processes involved in operating a vehicle safely.

Technical Explanation

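The paper presents a Drivers' Points-of-Gaze Dataset collected during real driving sessions, together with a convolutional network that estimates the point-of-gaze from an image of the scene and an image of the driver's face. A camera calibration module in the network computes an embedding of the spatial configuration between the driver and the camera system, and the whole model is trained end to end. The dataset includes detailed gaze data, such as the location and duration of drivers' visual fixations, which the researchers analyze to uncover patterns in driver attention allocation.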
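As an illustration of how fixation location and duration can be derived from raw gaze samples, here is a minimal sketch of a simplified dispersion-threshold detector. This is a generic eye-tracking technique, not a procedure confirmed by the paper; the thresholds, the sample format, and the function name are assumptions.

```python
def detect_fixations(samples, max_dispersion=30.0, min_duration_ms=100):
    """Group consecutive gaze samples into fixations.

    samples: list of (t_ms, x, y) tuples sorted by time.
    Returns a list of (start_ms, end_ms, centroid_x, centroid_y).
    A window counts as a fixation when its dispersion, measured as
    (max x - min x) + (max y - min y), stays within max_dispersion pixels
    and it lasts at least min_duration_ms.
    """
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        j = i
        xs, ys = [samples[i][1]], [samples[i][2]]
        # Grow the window while the next sample keeps dispersion within the limit.
        while j + 1 < n:
            new_xs = xs + [samples[j + 1][1]]
            new_ys = ys + [samples[j + 1][2]]
            dispersion = (max(new_xs) - min(new_xs)) + (max(new_ys) - min(new_ys))
            if dispersion > max_dispersion:
                break
            xs, ys = new_xs, new_ys
            j += 1
        start, end = samples[i][0], samples[j][0]
        if end - start >= min_duration_ms:
            fixations.append((start, end, sum(xs) / len(xs), sum(ys) / len(ys)))
            i = j + 1  # continue after the fixation
        else:
            i += 1     # too short: slide the window start forward
    return fixations

# Hypothetical samples: two clusters of gaze points separated by a large jump.
samples = [
    (0, 100, 100), (50, 102, 101), (100, 101, 99),
    (150, 400, 300), (200, 402, 301), (250, 401, 299), (300, 403, 300),
]
fixations = detect_fixations(samples)
for start, end, cx, cy in fixations:
    print(f"{start}-{end} ms at ({cx:.1f}, {cy:.1f})")
```

The thresholds trade sensitivity for robustness: a larger dispersion limit merges nearby glances, while a longer minimum duration discards brief ones.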

The authors conduct several analyses on the dataset, including:

Identifying common regions of interest (e.g. the road ahead, side mirrors) that drivers tend to focus on
Examining how driver attention shifts over time and in response to different driving scenarios
Investigating individual differences in attention strategies between different drivers
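The region-of-interest analysis in the list above can be sketched as a simple tally of gaze points per region. The rectangles, region names, and gaze points below are hypothetical placeholders chosen for a 1280 by 720 frame; they are not regions defined in the paper.

```python
from collections import Counter

# Hypothetical regions as (x0, y0, x1, y1) rectangles in scene-image pixels.
REGIONS = {
    "road_ahead": (320, 180, 960, 540),
    "left_mirror": (0, 300, 160, 420),
    "right_mirror": (1120, 300, 1280, 420),
}

def region_of(point, regions=REGIONS):
    """Return the name of the first region containing the point, else 'other'."""
    x, y = point
    for name, (x0, y0, x1, y1) in regions.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return "other"

def attention_share(gaze_points, regions=REGIONS):
    """Fraction of gaze points that fall in each region."""
    if not gaze_points:
        return {}
    counts = Counter(region_of(p, regions) for p in gaze_points)
    return {name: count / len(gaze_points) for name, count in counts.items()}

# Hypothetical gaze points, one per frame.
gaze = [(640, 360), (700, 400), (80, 350), (1200, 700)]
shares = attention_share(gaze)
print(shares)
```

Computing the same shares per time window or per driver would give a rough view of how attention shifts over time and differs between individuals.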

The findings suggest that driver attention is a complex, dynamic process that can be influenced by factors like driving conditions, task demands, and personal driving styles. The researchers discuss how this dataset and analysis could inform the development of advanced driver assistance systems and autonomous vehicles that can better adapt to and support human attention behaviors.

Critical Analysis

The Drivers' Points-of-Gaze Dataset presented in this paper provides a valuable resource for understanding driver attention and behavior in real-world conditions. By collecting detailed eye-tracking data, the researchers are able to gain insights that go beyond what could be observed through traditional methods like surveys or controlled experiments.

However, the dataset is limited to a relatively small number of participants, which may constrain the generalizability of the findings. Additionally, the study does not delve deeply into factors that could influence driver attention, such as driver demographics, vehicle type, or road environment. Expanding the dataset and conducting more comprehensive analyses could lead to a richer understanding of the complex cognitive processes involved in driving.

The authors also acknowledge the potential privacy and ethical concerns around collecting and analyzing driver eye-tracking data. As driver assistance systems and autonomous vehicles become more prevalent, it will be crucial to address these issues and ensure that such technologies are developed and deployed in a responsible manner that respects the privacy and safety of drivers and passengers.

Conclusion

The Drivers' Points-of-Gaze Dataset and associated analysis presented in this paper offer valuable insights into how drivers allocate their visual attention during real-world driving. By leveraging eye-tracking technology, the researchers were able to gain a detailed understanding of driver attention patterns, which could inform the development of advanced driver assistance systems and autonomous vehicles that are better aligned with human attention and behavior. While the dataset has some limitations, this work represents an important step towards a more comprehensive understanding of the cognitive processes involved in safe and effective driving.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers


Understanding and Modeling the Effects of Task and Context on Drivers' Gaze Allocation

Iuliia Kotseruba, John K. Tsotsos

To further advance driver monitoring and assistance systems, it is important to understand how drivers allocate their attention, in other words, where do they tend to look and why. Traditionally, factors affecting human visual attention have been divided into bottom-up (involuntary attraction to salient regions) and top-down (driven by the demands of the task being performed). Although both play a role in directing drivers' gaze, most of the existing models for drivers' gaze prediction apply techniques developed for bottom-up saliency and do not consider influences of the drivers' actions explicitly. Likewise, common driving attention benchmarks lack relevant annotations for drivers' actions and the context in which they are performed. Therefore, to enable analysis and modeling of these factors for drivers' gaze prediction, we propose the following: 1) we correct the data processing pipeline used in DR(eye)VE to reduce noise in the recorded gaze data; 2) we then add per-frame labels for driving task and context; 3) we benchmark a number of baseline and SOTA models for saliency and driver gaze prediction and use new annotations to analyze how their performance changes in scenarios involving different tasks; and, lastly, 4) we develop a novel model that modulates drivers' gaze prediction with explicit action and context information. While reducing noise in the DR(eye)VE gaze data improves results of all models, we show that using task information in our proposed model boosts performance even further compared to bottom-up models on the cleaned up data, both overall (by 24% KLD and 89% NSS) and on scenarios that involve performing safety-critical maneuvers and crossing intersections (by up to 10-30% KLD). Extended annotations and code are available at https://github.com/ykotseruba/SCOUT.

4/16/2024

cs.CV

SCOUT+: Towards Practical Task-Driven Drivers' Gaze Prediction

Iuliia Kotseruba, John K. Tsotsos

Accurate prediction of drivers' gaze is an important component of vision-based driver monitoring and assistive systems. Of particular interest are safety-critical episodes, such as performing maneuvers or crossing intersections. In such scenarios, drivers' gaze distribution changes significantly and becomes difficult to predict, especially if the task and context information is represented implicitly, as is common in many state-of-the-art models. However, explicit modeling of top-down factors affecting drivers' attention often requires additional information and annotations that may not be readily available. In this paper, we address the challenge of effective modeling of task and context with common sources of data for use in practical systems. To this end, we introduce SCOUT+, a task- and context-aware model for drivers' gaze prediction, which leverages route and map information inferred from commonly available GPS data. We evaluate our model on two datasets, DR(eye)VE and BDD-A, and demonstrate that using maps improves results compared to bottom-up models and reaches performance comparable to the top-down model SCOUT which relies on privileged ground truth information. Code is available at https://github.com/ykotseruba/SCOUT.

4/16/2024

cs.CV

Guiding Attention in End-to-End Driving Models

Diego Porres, Yi Xiao, Gabriel Villalonga, Alexandre Levy, Antonio M. López

Vision-based end-to-end driving models trained by imitation learning can lead to affordable solutions for autonomous driving. However, training these well-performing models usually requires a huge amount of data, while still lacking explicit and intuitive activation maps to reveal the inner workings of these models while driving. In this paper, we study how to guide the attention of these models to improve their driving quality and obtain more intuitive activation maps by adding a loss term during training using salient semantic maps. In contrast to previous work, our method does not require these salient semantic maps to be available during testing time, as well as removing the need to modify the model's architecture to which it is applied. We perform tests using perfect and noisy salient semantic maps with encouraging results in both, the latter of which is inspired by possible errors encountered with real data. Using CIL++ as a representative state-of-the-art model and the CARLA simulator with its standard benchmarks, we conduct experiments that show the effectiveness of our method in training better autonomous driving models, especially when data and computational resources are scarce.

5/2/2024

cs.CV cs.AI

Enhancing Road Safety: Real-Time Detection of Driver Distraction through Convolutional Neural Networks

Amaan Aijaz Sheikh, Imaad Zaffar Khan

As we navigate our daily commutes, the threat posed by distracted drivers looms large, resulting in a troubling rise in traffic accidents. Addressing this safety concern, our project harnesses the analytical power of Convolutional Neural Networks (CNNs), with a particular emphasis on the well-established models VGG16 and VGG19. These models are acclaimed for their precision in image recognition and are meticulously tested for their ability to detect nuances in driver behavior under varying environmental conditions. Through a comparative analysis against an array of CNN architectures, this study seeks to identify the most efficient model for real-time detection of driver distractions. The ultimate aim is to incorporate the findings into vehicle safety systems, significantly boosting their capability to prevent accidents triggered by inattention. This research not only enhances our understanding of automotive safety technologies but also marks a pivotal step towards creating vehicles that are intuitively aligned with driver behaviors, ensuring safer roads for all.

5/29/2024

cs.CV