DISHA: Low-Energy Sparse Transformer at Edge for Outdoor Navigation for the Visually Impaired Individuals

Read original: arXiv:2406.15864 - Published 6/26/2024 by Praveen Nagil, Sumit K. Mandal

Overview

  • This paper presents DISHA, a low-energy, sparse Transformer-based system for outdoor navigation assistance for visually impaired individuals.
  • The system applies a novel pruning technique to a Transformer that detects sidewalks from camera input, producing a sparse model that can provide navigation guidance efficiently.
  • Key innovations include a low-power edge-computing design and specialized training for outdoor navigation tasks.

Plain English Explanation

DISHA is a new technology designed to help visually impaired people navigate outdoor environments more easily. It uses a camera and a type of artificial intelligence called a Transformer to analyze the surroundings and provide guidance.

The Transformer architecture in DISHA is specially designed to be efficient and use minimal power, allowing the system to run on small, low-cost devices close to the user (at the "edge" rather than in a data center). This makes DISHA practical for real-world use, unlike some earlier navigation assistants that required bulky or power-hungry equipment.

DISHA has been trained specifically for the challenges of outdoor navigation, like detecting sidewalks, obstacles, and other important features. By focusing the system on these key tasks, the researchers were able to make it more accurate and responsive than a general-purpose computer vision system would be.

The overall goal of DISHA is to give visually impaired individuals greater independence and freedom of movement in outdoor settings, empowering them to navigate safely and confidently. This builds on previous research on assistive technology and visual navigation for the visually impaired.

Technical Explanation

The core of DISHA is a sparse Transformer model that takes input from a camera and produces navigation guidance. Transformers are a type of deep learning architecture that has shown great success in computer vision and other domains.

To make the Transformer efficient enough to run on edge devices, the researchers apply a novel pruning technique to the Transformer that detects sidewalks. Pruning yields a sparse model with lower execution latency and lower energy consumption on the edge device than the unpruned (dense) Transformer.
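The summary does not describe how the paper's pruning technique works. As a rough illustration of the general idea only, the sketch below uses plain magnitude pruning, which zeroes the smallest-magnitude weights of a layer. This is a generic stand-in and not the authors' method; `magnitude_prune` and `density` are hypothetical helper names, and the weights are made-up values.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of a 2-D weight matrix with the smallest-magnitude
    entries set to zero.

    weights:  list of rows, each a list of floats (illustrative values)
    sparsity: fraction of entries to zero out, between 0 and 1
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")

    # Collect (magnitude, row, col) for every entry and sort by magnitude.
    cells = [
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    ]
    cells.sort()

    # Zero out the n_prune entries with the smallest magnitude.
    n_prune = int(len(cells) * sparsity)
    pruned = [list(row) for row in weights]
    for _, r, c in cells[:n_prune]:
        pruned[r][c] = 0.0
    return pruned


def density(weights):
    """Fraction of entries that are still nonzero."""
    total = sum(len(row) for row in weights)
    nonzero = sum(1 for row in weights for w in row if w != 0.0)
    return nonzero / total


# Hypothetical 2x3 layer, pruned to 50% sparsity.
layer = [[0.9, -0.05, 0.3], [0.01, -0.7, 0.2]]
sparse_layer = magnitude_prune(layer, 0.5)
print(sparse_layer)           # [[0.9, 0.0, 0.3], [0.0, -0.7, 0.0]]
print(density(sparse_layer))  # 0.5
```

Zeroed weights save energy only when the runtime or hardware can skip them, which is presumably why the paper evaluates the pruned model on the edge device itself.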

The DISHA Transformer is trained on a large dataset of outdoor scenes, with annotations for key navigation features like sidewalks, obstacles, and traversable paths. This task-specific training allows the system to excel at the target application of outdoor navigation assistance.
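To make the role of sidewalk annotations concrete, here is a minimal sketch of how a predicted sidewalk mask could be scored against an annotated one. It assumes binary per-pixel masks and uses pixel accuracy and intersection-over-union (IoU); the summary does not say which metric or label format the paper uses, so both the metrics and the function names are assumptions.

```python
def pixel_accuracy(pred, target):
    """Fraction of pixels where the predicted label matches the annotation."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    return sum(1 for p, t in pairs if p == t) / len(pairs)


def iou(pred, target):
    """Intersection-over-union of the sidewalk class (label 1)."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    intersection = sum(1 for p, t in pairs if p == 1 and t == 1)
    union = sum(1 for p, t in pairs if p == 1 or t == 1)
    # If neither mask contains any sidewalk, treat them as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 2x3 masks: 1 = sidewalk pixel, 0 = anything else.
predicted = [[0, 1, 1], [0, 1, 0]]
annotated = [[0, 1, 0], [0, 1, 1]]
print(iou(predicted, annotated))  # 0.5
```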

According to the paper's abstract, experiments show up to a 32.49% improvement in accuracy and a 1.4-hour extension in battery life relative to a baseline technique, with the pruned Transformer running at low latency on the edge device. This makes it a practical solution for assistive navigation that visually impaired users can comfortably incorporate into their daily lives.
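The link between energy consumption and battery life can be sketched with simple arithmetic: runtime is roughly battery capacity divided by average power draw. The numbers below are hypothetical and are not taken from the paper; they only show how a lower power draw translates into extra hours.

```python
def battery_life_hours(capacity_wh, avg_power_w):
    """Approximate runtime: battery capacity (Wh) / average power draw (W)."""
    if capacity_wh <= 0 or avg_power_w <= 0:
        raise ValueError("capacity and power must be positive")
    return capacity_wh / avg_power_w


def life_extension_hours(capacity_wh, baseline_power_w, pruned_power_w):
    """Extra runtime gained by lowering the average power draw."""
    return (battery_life_hours(capacity_wh, pruned_power_w)
            - battery_life_hours(capacity_wh, baseline_power_w))


# Hypothetical device: 10 Wh battery, 2.5 W with a dense model,
# 2.0 W with a pruned model.
print(battery_life_hours(10.0, 2.5))         # 4.0
print(battery_life_hours(10.0, 2.0))         # 5.0
print(life_extension_hours(10.0, 2.5, 2.0))  # 1.0
```

This ignores effects such as battery ageing and the power drawn by the camera and other components, so it is only a first-order estimate.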

Critical Analysis

The DISHA paper presents a compelling solution to the challenge of outdoor navigation for the visually impaired. The sparse Transformer architecture and edge-computing design are innovative approaches that address key practical concerns around power consumption and form factor.

That said, the paper does not extensively discuss the system's performance in complex, real-world environments. The evaluation was primarily conducted in relatively controlled settings, so further research would be needed to understand how well DISHA handles the full diversity of outdoor scenes and obstacles.

Additionally, the paper does not provide much insight into the user experience or accessibility considerations beyond the technical system design. Gathering feedback from visually impaired individuals during the development process could help ensure DISHA meets their specific needs and preferences.

Overall, DISHA represents an important step forward in assistive navigation technology. With continued research and refinement, such systems have the potential to significantly improve the independence and quality of life for many visually impaired people.

Conclusion

The DISHA paper introduces a novel low-power, edge-based Transformer system for outdoor navigation assistance for the visually impaired. By pruning the Transformer and training it specifically for sidewalk detection, the researchers have created a practical solution that can run in real-time on small, portable devices.

This advance in assistive technology has the potential to empower visually impaired individuals with greater independence and freedom of movement. Further development and user testing will be crucial to refine the system and ensure it meets the diverse needs of the target population.

Overall, DISHA represents an exciting step forward in making outdoor navigation more accessible and inclusive for those with visual impairments. As this technology continues to evolve, it could have a profound impact on improving quality of life and enhancing social participation for an underserved community.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

DISHA: Low-Energy Sparse Transformer at Edge for Outdoor Navigation for the Visually Impaired Individuals

Praveen Nagil, Sumit K. Mandal

Assistive technology for visually impaired individuals is extremely useful to make them independent of another human being in performing day-to-day chores and instill confidence in them. One of the important aspects of assistive technology is outdoor navigation for visually impaired people. While there exist several techniques for outdoor navigation in the literature, they are mainly limited to obstacle detection. However, navigating a visually impaired person through the sidewalk (while the person is walking outside) is important too. Moreover, the assistive technology should ensure low-energy operation to extend the battery life of the device. Therefore, in this work, we propose an end-to-end technology deployed on an edge device to assist visually impaired people. Specifically, we propose a novel pruning technique for transformer algorithm which detects sidewalk. The pruning technique ensures low latency of execution and low energy consumption when the pruned transformer algorithm is deployed on the edge device. Extensive experimental evaluation shows that our proposed technology provides up to 32.49% improvement in accuracy and 1.4 hours of extension in battery life with respect to a baseline technique.

6/26/2024

StreetNav: Leveraging Street Cameras to Support Precise Outdoor Navigation for Blind Pedestrians

Gaurav Jain, Basel Hindi, Zihao Zhang, Koushik Srinivasula, Mingyu Xie, Mahshid Ghasemi, Daniel Weiner, Sophie Ana Paris, Xin Yi Therese Xu, Michael Malcolm, Mehmet Turkcan, Javad Ghaderi, Zoran Kostic, Gil Zussman, Brian A. Smith

Blind and low-vision (BLV) people rely on GPS-based systems for outdoor navigation. GPS's inaccuracy, however, causes them to veer off track, run into obstacles, and struggle to reach precise destinations. While prior work has made precise navigation possible indoors via hardware installations, enabling this outdoors remains a challenge. Interestingly, many outdoor environments are already instrumented with hardware such as street cameras. In this work, we explore the idea of repurposing existing street cameras for outdoor navigation. Our community-driven approach considers both technical and sociotechnical concerns through engagements with various stakeholders: BLV users, residents, business owners, and Community Board leadership. The resulting system, StreetNav, processes a camera's video feed using computer vision and gives BLV pedestrians real-time navigation assistance. Our evaluations show that StreetNav guides users more precisely than GPS, but its technical performance is sensitive to environmental occlusions and distance from the camera. We discuss future implications for deploying such systems at scale.

7/31/2024

Vision-based Wearable Steering Assistance for People with Impaired Vision in Jogging

Xiaotong Liu, Binglu Wang, Zhijun Li

Outdoor sports pose a challenge for people with impaired vision. The demand for higher-speed mobility inspired us to develop a vision-based wearable steering assistance. To ensure broad applicability, we focused on a representative sports environment, the athletics track. Our efforts centered on improving the speed and accuracy of perception, enhancing planning adaptability for the real world, and providing swift and safe assistance for people with impaired vision. In perception, we engineered a lightweight multitask network capable of simultaneously detecting track lines and obstacles. Additionally, due to the limitations of existing datasets for supporting multi-task detection in athletics tracks, we diligently collected and annotated a new dataset (MAT) containing 1000 images. In planning, we integrated the methods of sampling and spline curves, addressing the planning challenges of curves. Meanwhile, we utilized the positions of the track lines and obstacles as constraints to guide people with impaired vision safely along the current track. Our system is deployed on an embedded device, Jetson Orin NX. Through outdoor experiments, it demonstrated adaptability in different sports scenarios, assisting users in achieving free movement of 400-meter at an average speed of 1.34 m/s, meeting the level of normal people in jogging. Our MAT dataset is publicly available from https://github.com/snoopy-l/MAT

8/2/2024

TGS: Trajectory Generation and Selection using Vision Language Models in Mapless Outdoor Environments

Daeun Song, Jing Liang, Xuesu Xiao, Dinesh Manocha

We present a multi-modal trajectory generation and selection algorithm for real-world mapless outdoor navigation in challenging scenarios with unstructured off-road features like buildings, grass, and curbs. Our goal is to compute suitable trajectories that (1) satisfy the environment-specific traversability constraints and (2) generate human-like paths while navigating in crosswalks, sidewalks, etc. Our formulation uses a Conditional Variational Autoencoder (CVAE) generative model enhanced with traversability constraints to generate multiple candidate trajectories for global navigation. We use VLMs and a visual prompting approach with their zero-shot ability of semantic understanding and logical reasoning to choose the best trajectory given the contextual information about the task. We evaluate our methods in various outdoor scenes with wheeled robots and compare the performance with other global navigation algorithms. In practice, we observe at least 3.35% improvement in traversability and 20.61% improvement in terms of human-like navigation in generated trajectories in challenging outdoor navigation scenarios.

8/9/2024