Research on target detection method of distracted driving behavior based on improved YOLOv8

Read original: arXiv:2407.01864 - Published 7/8/2024 by Shiquan Shen, Zhizhong Wu, Pan Zhang

Overview

Existing deep learning-based methods for detecting and classifying distracted driving behavior are computationally intensive and have parameter redundancy.
This study proposes an improved YOLOv8 detection method by integrating the BoTNet module, GAM attention mechanism, and EIoU loss function.
The goal is to optimize feature extraction and multi-scale feature fusion strategies to simplify the training and inference processes, while improving detection accuracy and efficiency.

Plain English Explanation

The researchers wanted to create a more accurate and efficient system for detecting and classifying distracted driving behaviors using deep learning. Existing methods tended to be computationally demanding and had unnecessary complexity, which limited their practical use.

To address this, the researchers took the YOLOv8 model and made several improvements. They integrated the BoTNet module, which helps the model better extract features from the input data. They also added the GAM attention mechanism, which allows the model to focus on the most relevant parts of the image. Finally, they used the EIoU loss function, which helps the model make more accurate bounding box predictions.

According to the paper, these changes simplified the training and inference processes while improving the model's detection accuracy and speed. The improved model reportedly reached 99.4% accuracy in identifying and classifying distracted driving behaviors in real time. This allows the system to give drivers timely warnings, which can help enhance overall driving safety.

Technical Explanation

The researchers used the YOLOv8 model as a starting point and made several key modifications to improve its performance for detecting and classifying distracted driving behaviors:

BoTNet Module Integration: The researchers integrated the BoTNet (Bottleneck Transformer) module into the feature extraction backbone of the YOLOv8 model. BoTNet replaces the spatial convolution inside a bottleneck block with multi-head self-attention, which lets the backbone relate distant regions of the image and can lead to more robust feature extraction.
GAM Attention Mechanism: The researchers added the Global Attention Mechanism (GAM) to the model. GAM applies channel attention followed by spatial attention, which helps the model focus on the most relevant parts of the input image and further enhances feature extraction.
EIoU Loss Function: The researchers used the Efficient IoU (EIoU) loss function to optimize the model's bounding box predictions. EIoU extends the traditional IoU (Intersection over Union) loss with penalties on the distance between box centers and on the differences in box width and height, which can make training more stable and bounding boxes more accurate.
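
To make the EIoU loss more concrete, here is a minimal sketch for a single pair of boxes, following the commonly published EIoU formulation (1 - IoU plus a center-distance term, a width term, and a height term, each normalized by the smallest enclosing box). The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` guard are assumptions made for illustration. The summary does not show the authors' code, and a real training setup would compute this on batched tensors.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative sketch only; box format and eps are assumptions.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Plain IoU: overlap area divided by union area.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2

    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)
    return 1.0 - iou + center_term + width_term + height_term
```

Because the width and height terms are penalized separately, the loss still gives a useful signal when two boxes share a center but differ in shape, a case where plain IoU loss changes slowly.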

By integrating these components, the researchers were able to optimize the feature extraction and multi-scale feature fusion strategies of the YOLOv8 model. This simplifies the training and inference processes, while significantly improving the detection accuracy and efficiency for identifying and classifying distracted driving behaviors.
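
The kind of global self-attention that BoTNet brings into the backbone can be illustrated with a small, dependency-free sketch. It flattens a feature map into one token per spatial position and lets every position attend to every other. This is a simplification under stated assumptions: it uses a single head and identity query/key/value projections, and it omits the learned projections and relative position encodings of BoTNet. The function names are hypothetical, and a real model would use a tensor library rather than nested lists.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_feature_map(fmap):
    """Turn an H x W x C nested list into a list of H*W token vectors."""
    return [pixel for row in fmap for pixel in row]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), purely to keep the sketch short.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in tokens:
        # Similarity of this position to every position in the map.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights = softmax(scores)
        # Weighted average of all value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens))
                    for j in range(d)])
    return out


# Hypothetical 2 x 2 feature map with 2 channels.
feature_map = [[[1.0, 0.0], [0.0, 1.0]],
               [[1.0, 1.0], [0.5, 0.5]]]
attended = self_attention(flatten_feature_map(feature_map))
```

Each output vector is a weighted average over all positions, which is how attention lets a position on one side of the image draw on evidence from the other side.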

Critical Analysis

The researchers have provided a thorough and well-designed study to address the limitations of existing deep learning-based methods for detecting and classifying distracted driving behaviors. The integration of the BoTNet module, GAM attention mechanism, and EIoU loss function appears to be a well-thought-out approach to improve the model's performance.

One potential limitation is the specific dataset used for training and evaluation. The researchers mention that the model achieves a 99.4% accuracy rate, but it's important to understand the diversity and representativeness of the dataset to ensure the model's generalization to real-world scenarios. It would be beneficial to see the model's performance evaluated on additional datasets or in more diverse driving conditions.
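
One way to probe a headline accuracy figure is to break it down per class. The sketch below assumes each image has been reduced to a single predicted behavior label. The labels `"safe"` and `"phone"` and the toy data are hypothetical and do not come from the paper. It shows how overall accuracy can look acceptable while a rare class is missed entirely.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall, support."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    report = {}
    for c in sorted(set(y_true) | set(y_pred)):
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        report[c] = {
            "precision": tp[c] / predicted if predicted else 0.0,
            "recall": tp[c] / actual if actual else 0.0,
            "support": actual,
        }
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, report


# Hypothetical, imbalanced toy data: the classifier always says "safe".
y_true = ["safe"] * 8 + ["phone"] * 2
y_pred = ["safe"] * 10
accuracy, report = per_class_report(y_true, y_pred)
```

Here overall accuracy is 80% even though no "phone" example is detected, which is why per-class results on a varied dataset say more than a single number.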

Furthermore, while the researchers discuss the model's efficiency in terms of reduced computational complexity and ease of deployment, it would be valuable to provide more detailed benchmarks or comparisons to other state-of-the-art models to quantify the improvements more precisely.
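
A like-for-like speed comparison could be collected with a small timing harness along these lines. This is a sketch under assumptions: `benchmark` and its parameters are hypothetical names, and `fn` stands in for any model's inference call on one input. It measures wall-clock latency only, so parameter counts and FLOPs would need separate reporting.

```python
import statistics
import time


def benchmark(fn, sample, warmup=5, runs=50):
    """Time fn(sample) and report median latency and throughput."""
    # Warm-up calls are excluded so one-time setup costs do not skew results.
    for _ in range(warmup):
        fn(sample)
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(sample)
        latencies.append(time.perf_counter() - start)
    median = statistics.median(latencies)
    return {
        "median_ms": median * 1000.0,
        "fps": 1.0 / median if median > 0 else float("inf"),
    }


# Stand-in workload; replace with a real model's inference call.
result = benchmark(lambda x: sum(x), list(range(1000)), warmup=1, runs=5)
```

Running the same harness on the baseline YOLOv8 and on the improved model, with the same hardware and input size, would put the efficiency claim on a measurable footing.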

Overall, the researchers have presented a promising approach to enhance the detection and classification of distracted driving behaviors using deep learning. Further research and validation on larger and more diverse datasets would help strengthen the findings and ensure the model's robustness for practical applications.

Conclusion

This study proposes an improved YOLOv8 detection method that integrates the BoTNet module, GAM attention mechanism, and EIoU loss function to address the limitations of existing deep learning-based approaches for detecting and classifying distracted driving behaviors.

The researchers optimized the feature extraction and multi-scale feature fusion strategies, simplifying the training and inference processes while improving detection accuracy and efficiency. The improved model reportedly achieved 99.4% accuracy in identifying and classifying distracted driving behaviors in real time, enabling timely warnings to enhance overall driving safety.

This research demonstrates the potential of advanced deep learning techniques to tackle complex computer vision problems in the automotive domain, with practical implications for improving road safety through the detection and mitigation of distracted driving behaviors.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Research on target detection method of distracted driving behavior based on improved YOLOv8

Shiquan Shen, Zhizhong Wu, Pan Zhang

With the development of deep learning technology, the detection and classification of distracted driving behaviour requires higher accuracy. Existing deep learning-based methods are computationally intensive and parameter redundant, limiting the efficiency and accuracy in practical applications. To solve this problem, this study proposes an improved YOLOv8 detection method based on the original YOLOv8 model by integrating the BoTNet module, GAM attention mechanism and EIoU loss function. By optimising the feature extraction and multi-scale feature fusion strategies, the training and inference processes are simplified, and the detection accuracy and efficiency are significantly improved. Experimental results show that the improved model performs well in both detection speed and accuracy, with an accuracy rate of 99.4%, and the model is smaller and easy to deploy, which is able to identify and classify distracted driving behaviours in real time, provide timely warnings, and enhance driving safety.

7/8/2024

Research on Driver Facial Fatigue Detection Based on Yolov8 Model

Chang Zhou, Yang Zhao, Shaobo Liu, Yi Zhao, Xingchen Li, Chiyu Cheng

In a society where traffic accidents frequently occur, fatigue driving has emerged as a grave issue. Fatigue driving detection technology, especially those based on the YOLOv8 deep learning model, has seen extensive research and application as an effective preventive measure. This paper discusses in depth the methods and technologies utilized in the YOLOv8 model to detect driver fatigue, elaborates on the current research status both domestically and internationally, and systematically introduces the processing methods and algorithm principles for various datasets. This study aims to provide a robust technical solution for preventing and detecting fatigue driving, thereby contributing significantly to reducing traffic accidents and safeguarding lives.

6/28/2024

Real-Time Detection and Analysis of Vehicles and Pedestrians using Deep Learning

Md Nahid Sadik, Tahmim Hossain, Faisal Sayeed

Computer vision, particularly vehicle and pedestrian identification is critical to the evolution of autonomous driving, artificial intelligence, and video surveillance. Current traffic monitoring systems confront major difficulty in recognizing small objects and pedestrians effectively in real-time, posing a serious risk to public safety and contributing to traffic inefficiency. Recognizing these difficulties, our project focuses on the creation and validation of an advanced deep-learning framework capable of processing complex visual input for precise, real-time recognition of cars and people in a variety of environmental situations. On a dataset representing complicated urban settings, we trained and evaluated different versions of the YOLOv8 and RT-DETR models. The YOLOv8 Large version proved to be the most effective, especially in pedestrian recognition, with great precision and robustness. The results, which include Mean Average Precision and recall rates, demonstrate the model's ability to dramatically improve traffic monitoring and safety. This study makes an important addition to real-time, reliable detection in computer vision, establishing new benchmarks for traffic management systems.

4/15/2024

You Only Look at Once for Real-time and Generic Multi-Task

Jiayuan Wang, Q. M. Jonathan Wu, Ning Zhang

High precision, lightweight, and real-time responsiveness are three essential requirements for implementing autonomous driving. In this study, we incorporate A-YOLOM, an adaptive, real-time, and lightweight multi-task model designed to concurrently address object detection, drivable area segmentation, and lane line segmentation tasks. Specifically, we develop an end-to-end multi-task model with a unified and streamlined segmentation structure. We introduce a learnable parameter that adaptively concatenates features between necks and backbone in segmentation tasks, using the same loss function for all segmentation tasks. This eliminates the need for customizations and enhances the model's generalization capabilities. We also introduce a segmentation head composed only of a series of convolutional layers, which reduces the number of parameters and inference time. We achieve competitive results on the BDD100k dataset, particularly in visualization outcomes. The performance results show a mAP50 of 81.1% for object detection, a mIoU of 91.0% for drivable area segmentation, and an IoU of 28.8% for lane line segmentation. Additionally, we introduce real-world scenarios to evaluate our model's performance in a real scene, which significantly outperforms competitors. This demonstrates that our model not only exhibits competitive performance but is also more flexible and faster than existing multi-task models. The source codes and pre-trained models are released at https://github.com/JiayuanWang-JW/YOLOv8-multi-task

4/26/2024