Multi-faceted Sensory Substitution for Curb Alerting: A Pilot Investigation in Persons with Blindness and Low Vision

Read original: arXiv:2408.14578 - Published 8/29/2024 by Ligao Ruan, Giles Hamilton-Fletcher, Mahya Beheshti, Todd E Hudson, Maurizio Porfiri, JR Rizzo

Overview

  • Curbs are the raised edges of sidewalks where they meet the street, important for delineating safe pedestrian zones.
  • However, curbs pose significant navigation challenges for people who are blind or have low vision (pBLV).
  • Failing to detect these abrupt elevation changes, or to orient to them properly, can lead to falls and injuries, and advances in assistive technology have not yet solved the problem.
  • This paper introduces a novel, multi-faceted sensory substitution approach using a smart wearable device to detect and provide early warning of curbs.

Plain English Explanation

The paper addresses a problem that people who are blind or have low vision face when navigating cities: detecting curbs. Curbs are the raised edges of sidewalks where they meet the street, and they separate safe pedestrian zones from vehicle lanes. For someone with a visual impairment, these abrupt elevation changes are hard to detect and to line up with, and a missed curb can mean a fall or a serious injury.

To address this, the researchers developed a smart wearable device that uses an RGB camera and an embedded system to capture and segment curbs in real time. A YOLOv8 segmentation model, trained on the authors' own curb dataset, identifies curbs in the camera image. The device then gives early warning and orientation information through adaptive auditory beeps, abstract sonification, and speech output.

In a pilot experiment with human participants, the system gave earlier warning of curbs than a traditional white cane, which means a larger safety window. Its curb orientation information was nearly identical to what the cane provides, so it could be a useful tool for improving the mobility and safety of people with visual impairments.

Technical Explanation

The paper presents a novel, multi-faceted sensory substitution approach to the problem of detecting and properly orienting to curbs for people who are blind or have low vision (pBLV). The system runs on a smart wearable device and uses an RGB camera and an embedded system to capture and segment curbs in real time.
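A segmentation mask on its own is not yet a cue a listener can use. The sketch below shows one possible way to reduce a binary curb mask to a single in-image angle, using the principal axis of the curb pixels. It is an illustration only: this summary does not say how the authors compute orientation, and the function name `curb_angle_degrees` and the list-of-lists mask format are assumptions made for the example.

```python
import math


def curb_angle_degrees(mask):
    """Estimate the in-image angle of a curb from a binary mask.

    Hypothetical helper, not the authors' method.
    `mask` is a list of rows, each a list of 0/1 values (1 = curb pixel).
    Returns the angle of the mask's principal axis in degrees, between
    -90 and 90, measured from the image's horizontal axis. An angle of 0
    means the curb runs straight across the frame.
    Returns None when fewer than two curb pixels are present.
    """
    points = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if len(points) < 2:
        return None

    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Entries of the 2x2 scatter matrix of the curb pixels.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)

    # Direction of the dominant eigenvector of that matrix.
    angle_rad = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return math.degrees(angle_rad)
```

Image rows grow downward, so a positive angle here means the curb line drops toward the right side of the frame. Turning this image-space angle into a real-world heading would also need the camera's pose and calibration, which this sketch leaves out.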

The core of the system is a YOLO (You Only Look Once) v8 segmentation model, trained on a custom curb dataset, which segments curbs in the camera input. The system's output consists of adaptive auditory beeps, abstract sonification, and speech, which convey the relative distance and orientation of detected curbs to the user.
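To make the idea of an adaptive beep concrete, here is a minimal sketch that maps curb distance to the pause between beeps and builds a short spoken phrase. Everything in it is assumed for illustration: the function names, the linear mapping, the distance range of 0.5 to 5 meters, the beep intervals, the 10 degree tolerance, and the sign convention for the angle are not taken from the paper.

```python
def beep_interval_s(distance_m, near_m=0.5, far_m=5.0,
                    fastest_s=0.1, slowest_s=1.0):
    """Map curb distance to seconds between beeps (closer means faster).

    Hypothetical linear mapping; all default values are placeholders.
    Distances outside [near_m, far_m] are clamped to that range.
    """
    clamped = max(near_m, min(far_m, distance_m))
    fraction = (clamped - near_m) / (far_m - near_m)
    return fastest_s + fraction * (slowest_s - fastest_s)


def speech_cue(distance_m, angle_deg, square_tolerance_deg=10.0):
    """Build a short spoken message about a detected curb.

    Assumes a positive angle means the curb is angled to the right;
    that sign convention is an assumption of this sketch.
    """
    if abs(angle_deg) <= square_tolerance_deg:
        facing = "straight ahead"
    elif angle_deg > 0:
        facing = "angled to the right"
    else:
        facing = "angled to the left"
    return f"Curb {distance_m:.1f} meters, {facing}"


if __name__ == "__main__":
    for distance in (5.0, 2.0, 0.5):
        print(distance, round(beep_interval_s(distance), 2))
    print(speech_cue(2.0, -25.0))
```

A linear mapping is only the simplest choice. A real device might prefer a curve that changes faster at short range, where timing matters most.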

In a human-subjects experiment, the researchers compared their system with the traditional white cane. The system provided advance warning of curbs through a larger safety window than the cane, giving users more room to detect and negotiate the elevation change. Its curb orientation information was nearly identical to the cane's, making it a plausible aid for pBLV users.
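One way to read "safety window" is as the time a walker has between the first warning and reaching the curb. The sketch below works through that arithmetic under this assumed reading. The numbers are invented placeholders, not results from the paper: a cane that touches the curb at 1 meter, a camera alert at 3 meters, and a walking speed of 1 meter per second.

```python
def safety_window_s(warning_distance_m, walking_speed_m_s):
    """Seconds between the first warning and arrival at the curb.

    Hypothetical definition used only for this example.
    """
    if walking_speed_m_s <= 0:
        raise ValueError("walking speed must be positive")
    return warning_distance_m / walking_speed_m_s


if __name__ == "__main__":
    speed = 1.0          # placeholder walking speed, m/s
    cane_reach = 1.0     # placeholder cane detection distance, m
    camera_alert = 3.0   # placeholder camera alert distance, m

    cane_window = safety_window_s(cane_reach, speed)
    camera_window = safety_window_s(camera_alert, speed)
    print(f"cane: {cane_window:.1f} s, camera: {camera_window:.1f} s, "
          f"gain: {camera_window - cane_window:.1f} s")
```

Under these made-up inputs the earlier alert adds two seconds of reaction time. The actual distances and timings are in the original paper.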

Critical Analysis

The paper presents a promising approach to the navigation challenges that curbs pose for people who are blind or have low vision. The multi-faceted design, which pairs camera-based curb segmentation with three kinds of audio output (adaptive beeps, abstract sonification, and speech), is a thoughtful response to a pressing problem.

However, the paper does mention some limitations and areas for further research. For example, the system's performance may be affected by environmental factors, such as poor lighting or occlusions, which could impact the accuracy of the curb detection. Additionally, the researchers acknowledge the need for more extensive user testing to evaluate the system's long-term usability and acceptance by the target population.

It would also be interesting to explore the integration of the curb detection system with other assistive technologies, such as foundation models or LiDAR-based approaches, to create a more comprehensive and robust navigation solution for pBLV individuals.

Conclusion

The paper presents a novel, multi-faceted sensory substitution approach to detecting and providing early warning of curbs, a significant navigation hazard for people who are blind or have low vision. Using a smart wearable device with a camera and an embedded system, the researchers built a system that identifies curbs and conveys their distance and orientation to users through adaptive auditory beeps, abstract sonification, and speech.

The human-subjects experiment supports this approach: the system provided advance warning of curbs and orientation information nearly identical to that of a traditional white cane. As a pilot investigation, it points to technology that could improve the mobility and safety of people with visual impairments and help them navigate urban environments with greater confidence and independence.

As the researchers continue to refine and expand the capabilities of this system, it will be exciting to see how it can be integrated with other assistive technologies to create even more comprehensive and accessible solutions for this underserved population.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Multi-faceted Sensory Substitution for Curb Alerting: A Pilot Investigation in Persons with Blindness and Low Vision

Ligao Ruan, Giles Hamilton-Fletcher, Mahya Beheshti, Todd E Hudson, Maurizio Porfiri, JR Rizzo

Curbs -- the edge of a raised sidewalk at the point where it meets a street -- crucial in urban environments where they help delineate safe pedestrian zones, from dangerous vehicular lanes. However, curbs themselves are significant navigation hazards, particularly for people who are blind or have low vision (pBLV). The challenges faced by pBLV in detecting and properly orientating themselves for these abrupt elevation changes can lead to falls and serious injuries. Despite recent advancements in assistive technologies, the detection and early warning of curbs remains a largely unsolved challenge. This paper aims to tackle this gap by introducing a novel, multi-faceted sensory substitution approach hosted on a smart wearable; the platform leverages an RGB camera and an embedded system to capture and segment curbs in real time and provide early warning and orientation information. The system utilizes YOLO (You Only Look Once) v8 segmentation model, trained on our custom curb dataset for the camera input. The output of the system consists of adaptive auditory beeps, abstract sonification, and speech, conveying information about the relative distance and orientation of curbs. Through human-subjects experimentation, we demonstrate the effectiveness of the system as compared to the white cane. Results show that our system can provide advanced warning through a larger safety window than the cane, while offering nearly identical curb orientation information.

8/29/2024

Can Foundation Models Reliably Identify Spatial Hazards? A Case Study on Curb Segmentation

Diwei Sheng, Giles Hamilton-Fletcher, Mahya Beheshti, Chen Feng, John-Ross Rizzo

Curbs serve as vital borders that delineate safe pedestrian zones from potential vehicular traffic hazards. Curbs also represent a primary spatial hazard during dynamic navigation with significant stumbling potential. Such vulnerabilities are particularly exacerbated for persons with blindness and low vision (PBLV). Accurate visual-based discrimination of curbs is paramount for assistive technologies that aid PBLV with safe navigation in urban environments. Herein, we investigate the efficacy of curb segmentation for foundation models. We introduce the largest curb segmentation dataset to-date to benchmark leading foundation models. Our results show that state-of-the-art foundation models face significant challenges in curb segmentation. This is due to their high false-positive rates (up to 95%) with poor performance distinguishing curbs from curb-like objects or non-curb areas, such as sidewalks. In addition, the best-performing model averaged a 3.70-second inference time, underscoring problems in providing real-time assistance. In response, we propose solutions including filtered bounding box selections to achieve more accurate curb segmentation. Overall, despite the immediate flexibility of foundation models, their application for practical assistive technology applications still requires refinement. This research highlights the critical need for specialized datasets and tailored model training to address navigation challenges for PBLV and underscores implicit weaknesses in foundation models.

6/12/2024

CurbNet: Curb Detection Framework Based on LiDAR Point Cloud Segmentation

Guoyang Zhao, Fulong Ma, Weiqing Qi, Yuxuan Liu, Ming Liu

Curb detection is a crucial function in intelligent driving, essential for determining drivable areas on the road. However, the complexity of road environments makes curb detection challenging. This paper introduces CurbNet, a novel framework for curb detection utilizing point cloud segmentation. To address the lack of comprehensive curb datasets with 3D annotations, we have developed the 3D-Curb dataset based on SemanticKITTI, currently the largest and most diverse collection of curb point clouds. Recognizing that the primary characteristic of curbs is height variation, our approach leverages spatially rich 3D point clouds for training. To tackle the challenges posed by the uneven distribution of curb features on the xy-plane and their dependence on high-frequency features along the z-axis, we introduce the Multi-Scale and Channel Attention (MSCA) module, a customized solution designed to optimize detection performance. Additionally, we propose an adaptive weighted loss function group specifically formulated to counteract the imbalance in the distribution of curb point clouds relative to other categories. Extensive experiments conducted on 2 major datasets demonstrate that our method surpasses existing benchmarks set by leading curb detection and point cloud segmentation models. Through the post-processing refinement of the detection results, we have significantly reduced noise in curb detection, thereby improving precision by 4.5 points. Similarly, our tolerance experiments also achieved state-of-the-art results. Furthermore, real-world experiments and dataset analyses mutually validate each other, reinforcing CurbNet's superior detection capability and robust generalizability. The project website is available at: https://github.com/guoyangzhao/CurbNet/.

5/31/2024

A Multi-Modal Foundation Model to Assist People with Blindness and Low Vision in Environmental Interaction

Yu Hao, Fan Yang, Hao Huang, Shuaihang Yuan, Sundeep Rangan, John-Ross Rizzo, Yao Wang, Yi Fang

People with blindness and low vision (pBLV) encounter substantial challenges when it comes to comprehensive scene recognition and precise object identification in unfamiliar environments. Additionally, due to the vision loss, pBLV have difficulty in accessing and identifying potential tripping hazards on their own. In this paper, we present a pioneering approach that leverages a large vision-language model to enhance visual perception for pBLV, offering detailed and comprehensive descriptions of the surrounding environments and providing warnings about the potential risks. Our method begins by leveraging a large image tagging model (i.e., Recognize Anything (RAM)) to identify all common objects present in the captured images. The recognition results and user query are then integrated into a prompt, tailored specifically for pBLV using prompt engineering. By combining the prompt and input image, a large vision-language model (i.e., InstructBLIP) generates detailed and comprehensive descriptions of the environment and identifies potential risks in the environment by analyzing the environmental objects and scenes, relevant to the prompt. We evaluate our approach through experiments conducted on both indoor and outdoor datasets. Our results demonstrate that our method is able to recognize objects accurately and provide insightful descriptions and analysis of the environment for pBLV.

4/30/2024