Performance Evaluation of Real-Time Object Detection for Electric Scooters

2405.03039

Published 5/7/2024 by Dong Chen, Arman Hosseini, Arik Smith, Amir Farzin Nikkhah, Arsalan Heydarian, Omid Shoghli, Bradford Campbell

cs.CV cs.SY eess.SY

Abstract

Electric scooters (e-scooters) have rapidly emerged as a popular mode of transportation in urban areas, yet they pose significant safety challenges. In the United States, the rise of e-scooters has been marked by a concerning increase in related injuries and fatalities. Recently, while deep-learning object detection holds paramount significance in autonomous vehicles to avoid potential collisions, its application in the context of e-scooters remains relatively unexplored. This paper addresses this gap by assessing the effectiveness and efficiency of cutting-edge object detectors designed for e-scooters. To achieve this, the first comprehensive benchmark involving 22 state-of-the-art YOLO object detectors, including five versions (YOLOv3, YOLOv5, YOLOv6, YOLOv7, and YOLOv8), has been established for real-time traffic object detection using a self-collected dataset featuring e-scooters. The detection accuracy, measured in terms of mAP@0.5, ranges from 27.4% (YOLOv7-E6E) to 86.8% (YOLOv5s). All YOLO models, particularly YOLOv3-tiny, have displayed promising potential for real-time object detection in the context of e-scooters. Both the traffic scene dataset (https://zenodo.org/records/10578641) and software program codes (https://github.com/DongChen06/ScooterDet) for model benchmarking in this study are publicly available, which will not only improve e-scooter safety with advanced object detection but also lay the groundwork for tailored solutions, promising a safer and more sustainable urban micromobility landscape.

Overview

E-scooters have become a popular mode of transportation in urban areas, but pose significant safety challenges
Existing research on using deep learning for object detection to improve e-scooter safety is limited
This paper addresses this gap by benchmarking 22 state-of-the-art YOLO object detectors on a dataset of real-world e-scooter traffic scenes

Plain English Explanation

Electric scooters, or e-scooters, have become a common sight in many cities as a convenient way for people to get around. However, the rise in e-scooter usage has also been accompanied by an increase in accidents and injuries. To help improve e-scooter safety, this research looked at using advanced AI object detection models to automatically detect e-scooters and other vehicles or pedestrians in real-time.

The researchers created a dataset of real-world e-scooter traffic scenes and then tested 22 models from five versions of the YOLO object detection algorithm, which is known for its speed and accuracy. Detection accuracy varied widely, from 27.4% for YOLOv7-E6E to 86.8% for YOLOv5s, and the models, particularly the smaller YOLOv3-tiny, showed promise for real-time use. This suggests that these AI models could be used in e-scooter safety systems to help avoid collisions.

The researchers have made their dataset and code publicly available, which should help advance research in this area and lead to safer e-scooter usage in cities.

Technical Explanation

The researchers established the first comprehensive benchmark for evaluating the performance of state-of-the-art YOLO object detection models on the task of detecting e-scooters and other traffic objects in real-world scenes. They tested 22 YOLO models spanning five versions (YOLOv3, YOLOv5, YOLOv6, YOLOv7, and YOLOv8) on a self-collected dataset of e-scooter traffic scenes.
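To make the "real-time" side of such a benchmark concrete, the sketch below times a detector over a batch of frames and reports average latency and frames per second. It is a minimal illustration, not the authors' benchmarking code: `benchmark_fps`, `is_real_time`, `dummy_detect`, and the 30 FPS target are all hypothetical names and values chosen for this example, and a real run would pass in an actual YOLO inference call and real images.

```python
import time
from typing import Callable, Dict, Sequence


def benchmark_fps(detect: Callable[[object], object],
                  frames: Sequence[object],
                  warmup: int = 2) -> Dict[str, float]:
    """Time `detect` over `frames`; return mean latency (ms) and throughput (FPS)."""
    # Warm-up calls are excluded from timing (caches, lazy initialisation, etc.).
    for frame in frames[:warmup]:
        detect(frame)

    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start

    latency_ms = 1000.0 * elapsed / len(frames)
    fps = len(frames) / elapsed if elapsed > 0 else float("inf")
    return {"latency_ms": latency_ms, "fps": fps}


def is_real_time(stats: Dict[str, float], target_fps: float = 30.0) -> bool:
    """Assumed criterion: throughput at or above `target_fps` (30 is illustrative)."""
    return stats["fps"] >= target_fps


def dummy_detect(frame):
    """Hypothetical stand-in for a detector; a real one would return boxes."""
    return [sum(frame)]


if __name__ == "__main__":
    fake_frames = [[1, 2, 3]] * 100  # placeholder data, not real images
    stats = benchmark_fps(dummy_detect, fake_frames)
    print(stats, "real-time:", is_real_time(stats))
```

Running the same harness over each candidate model on the same hardware gives comparable speed numbers to set against the accuracy figures.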

The detection accuracy, measured in terms of mean average precision (mAP) at an intersection-over-union (IoU) threshold of 0.5, ranged from 27.4% for the YOLOv7-E6E model to 86.8% for the YOLOv5s model. Overall, the YOLO models displayed promising potential for real-time object detection in the e-scooter context, with the smaller YOLOv3-tiny model performing particularly well.
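The metric can be illustrated with a small sketch. The code below computes IoU for two boxes and a simplified average precision (AP) for a single class in a single image at an IoU threshold of 0.5; mAP is then the mean of AP over all classes. This is a teaching example under stated assumptions, not the evaluation code used in the paper: boxes are assumed to be in `(x1, y1, x2, y2)` corner format, matching is greedy by confidence, and the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) corner format


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: List[Tuple[float, Box]],
                      gts: List[Box],
                      iou_thr: float = 0.5) -> float:
    """Simplified single-class AP: area under the precision-recall curve."""
    if not gts:
        return 0.0
    preds = sorted(preds, key=lambda p: p[0], reverse=True)  # by confidence
    matched = [False] * len(gts)
    tp_flags = []
    for _, box in preds:
        # Greedily match each prediction to the best still-unmatched ground truth.
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if matched[j]:
                continue
            value = iou(box, gt)
            if value > best_iou:
                best_iou, best_j = value, j
        is_tp = best_j >= 0 and best_iou >= iou_thr
        if is_tp:
            matched[best_j] = True
        tp_flags.append(is_tp)

    # Cumulative precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(gts))

    # Make precision non-increasing in recall (the usual interpolation step).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
    predictions = [(0.9, (0, 0, 10, 10)),    # correct
                   (0.8, (50, 50, 60, 60)),  # false positive
                   (0.7, (20, 20, 30, 30))]  # correct
    print(average_precision(predictions, ground_truth))
```

Production evaluators differ in details such as how ties and crowded regions are handled, so figures from this sketch are not directly comparable with the paper's reported numbers.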

The researchers have made both the dataset and the software code for the benchmark publicly available, which should help advance research on using AI-based object detection to improve e-scooter safety and support the development of tailored solutions for the urban micromobility landscape.

Critical Analysis

While the researchers have made a valuable contribution by establishing a benchmark for evaluating object detection models in the e-scooter context, there are a few potential limitations and areas for further research:

The dataset used for the benchmark, while a good starting point, may not capture the full diversity of e-scooter traffic scenes encountered in real-world urban environments. Expanding the dataset to include a wider range of scenarios could help further validate the performance of the object detection models.
The paper does not explore the potential impact of environmental factors, such as weather conditions or lighting, on the object detection accuracy. Evaluating the models' robustness to these variations would be an important next step.
The researchers focus solely on the YOLO family of object detectors. Incorporating and benchmarking other state-of-the-art detection algorithms, such as DETR or Attention YOLO, could provide a more comprehensive understanding of the object detection landscape for e-scooter safety applications.

Overall, this research represents a valuable step forward in using AI-based object detection to improve e-scooter safety, but further exploration and validation will be necessary to develop reliable and effective solutions for real-world deployment.

Conclusion

This paper addresses an important and timely issue by assessing the capabilities of state-of-the-art object detection models in the context of e-scooters, a rapidly growing mode of urban transportation that poses significant safety challenges. The researchers have established a comprehensive benchmark and made the necessary resources publicly available, which should help advance research and development in this area.

The findings suggest that advanced YOLO object detectors, particularly the smaller and more efficient versions, have the potential to be effectively deployed in e-scooter safety systems to detect and avoid potential collisions. As e-scooter usage continues to grow, this research lays the groundwork for tailored solutions that can contribute to a safer and more sustainable urban micromobility landscape.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Real-Time Detection and Analysis of Vehicles and Pedestrians using Deep Learning

Md Nahid Sadik, Tahmim Hossain, Faisal Sayeed

Computer vision, particularly vehicle and pedestrian identification is critical to the evolution of autonomous driving, artificial intelligence, and video surveillance. Current traffic monitoring systems confront major difficulty in recognizing small objects and pedestrians effectively in real-time, posing a serious risk to public safety and contributing to traffic inefficiency. Recognizing these difficulties, our project focuses on the creation and validation of an advanced deep-learning framework capable of processing complex visual input for precise, real-time recognition of cars and people in a variety of environmental situations. On a dataset representing complicated urban settings, we trained and evaluated different versions of the YOLOv8 and RT-DETR models. The YOLOv8 Large version proved to be the most effective, especially in pedestrian recognition, with great precision and robustness. The results, which include Mean Average Precision and recall rates, demonstrate the model's ability to dramatically improve traffic monitoring and safety. This study makes an important addition to real-time, reliable detection in computer vision, establishing new benchmarks for traffic management systems.

4/15/2024

cs.CV

A Review and Implementation of Object Detection Models and Optimizations for Real-time Medical Mask Detection during the COVID-19 Pandemic

Ioanna Gogou, Dimitrios Koutsomitropoulos

Convolutional Neural Networks (CNN) are commonly used for the problem of object detection thanks to their increased accuracy. Nevertheless, the performance of CNN-based detection models is ambiguous when detection speed is considered. To the best of our knowledge, there has not been sufficient evaluation of the available methods in terms of the speed/accuracy trade-off in related literature. This work assesses the most fundamental object detection models on the Common Objects in Context (COCO) dataset with respect to this trade-off, their memory consumption, and computational and storage cost. Next, we select a highly efficient model called YOLOv5 to train on the topical and unexplored dataset of human faces with medical masks, the Properly-Wearing Masked Faces Dataset (PWMFD), and analyze the benefits of specific optimization techniques for real-time medical mask detection: transfer learning, data augmentations, and a Squeeze-and-Excitation attention mechanism. Using our findings in the context of the COVID-19 pandemic, we propose an optimized model based on YOLOv5s using transfer learning for the detection of correctly and incorrectly worn medical masks that surpassed more than two times in speed (69 frames per second) the state-of-the-art model SE-YOLOv3 on the PWMFD dataset while maintaining the same level of mean Average Precision (67%).

5/29/2024

cs.CV cs.AI

Real-Time Flying Object Detection with YOLOv8

Dillon Reis, Jordan Kupec, Jacqueline Hong, Ahmad Daoudi

This paper presents a generalized model for real-time detection of flying objects that can be used for transfer learning and further research, as well as a refined model that achieves state-of-the-art results for flying object detection. We achieve this by training our first (generalized) model on a data set containing 40 different classes of flying objects, forcing the model to extract abstract feature representations. We then perform transfer learning with these learned parameters on a data set more representative of real world environments (i.e. higher frequency of occlusion, very small spatial sizes, rotations, etc.) to generate our refined model. Object detection of flying objects remains challenging due to large variances of object spatial sizes/aspect ratios, rate of speed, occlusion, and clustered backgrounds. To address some of the presented challenges while simultaneously maximizing performance, we utilize the current state-of-the-art single-shot detector, YOLOv8, in an attempt to find the best trade-off between inference speed and mean average precision (mAP). While YOLOv8 is being regarded as the new state-of-the-art, an official paper has not been released as of yet. Thus, we provide an in-depth explanation of the new architecture and functionality that YOLOv8 has adapted. Our final generalized model achieves a mAP50 of 79.2%, mAP50-95 of 68.5%, and an average inference speed of 50 frames per second (fps) on 1080p videos. Our final refined model maintains this inference speed and achieves an improved mAP50 of 99.1% and mAP50-95 of 83.5%

5/24/2024

cs.CV cs.LG

YOLOv10: Real-Time End-to-End Object Detection

Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, Jungong Han, Guiguang Ding

Over the past years, YOLOs have emerged as the predominant paradigm in the field of real-time object detection owing to their effective balance between computational cost and detection performance. Researchers have explored the architectural designs, optimization objectives, data augmentation strategies, and others for YOLOs, achieving notable progress. However, the reliance on the non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs and adversely impacts the inference latency. Besides, the design of various components in YOLOs lacks the comprehensive and thorough inspection, resulting in noticeable computational redundancy and limiting the model's capability. It renders the suboptimal efficiency, along with considerable potential for performance improvements. In this work, we aim to further advance the performance-efficiency boundary of YOLOs from both the post-processing and model architecture. To this end, we first present the consistent dual assignments for NMS-free training of YOLOs, which brings competitive performance and low inference latency simultaneously. Moreover, we introduce the holistic efficiency-accuracy driven model design strategy for YOLOs. We comprehensively optimize various components of YOLOs from both efficiency and accuracy perspectives, which greatly reduces the computational overhead and enhances the capability. The outcome of our effort is a new generation of YOLO series for real-time end-to-end object detection, dubbed YOLOv10. Extensive experiments show that YOLOv10 achieves state-of-the-art performance and efficiency across various model scales. For example, our YOLOv10-S is 1.8× faster than RT-DETR-R18 under the similar AP on COCO, meanwhile enjoying 2.8× smaller number of parameters and FLOPs. Compared with YOLOv9-C, YOLOv10-B has 46% less latency and 25% fewer parameters for the same performance.

5/24/2024

cs.CV