CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision

Read original: arXiv:2407.13515 - Published 7/30/2024 by Jaewook Lee, Andrew D. Tjahjadi, Jiho Kim, Junpu Yu, Minji Park, Jiawen Zhang, Jon E. Froehlich, Yapeng Tian, Yuhang Zhao

Overview

This paper presents CookAR, an augmented reality (AR) system designed to assist people with low vision when using kitchen tools.
The system uses computer vision techniques to detect and segment the affordances (actionable properties) of kitchen tools, and then overlays visual cues and information to help users interact with them more effectively.
The researchers validated CookAR with a technical evaluation of their fine-tuned segmentation model and a qualitative lab study with 10 participants who have low vision.

Plain English Explanation

CookAR is an augmented reality (AR) system that aims to help people with low vision perform kitchen tasks more easily. Many kitchen tools and appliances can be challenging for those with visual impairments to use, as it can be difficult to identify the key features and how to properly interact with them.

The CookAR system addresses this by using computer vision to analyze the kitchen environment and detect the "affordances" of different tools. Affordances are the actionable properties of an object - for example, the handle of a pot is an affordance that allows you to pick it up and move it. CookAR identifies these affordances and then overlays visual cues and information on top of the user's view through an AR headset. This could include highlighting the handles, buttons, or other key parts of a tool to make them more visible and intuitive to use.

The researchers behind CookAR ran a qualitative lab study to see how well the system worked in practice. Ten participants with low vision tried CookAR and gave feedback on which augmentation designs suited them. According to the paper's abstract, participants preferred augmentations that highlight a tool's affordances over traditional augmentations that highlight the whole object.

Overall, CookAR demonstrates how augmented reality can be leveraged to make the physical world more accessible for people with visual impairments. By overlaying relevant information and visual cues, it helps bridge the gap between the user's perception of their environment and the actual affordances of the objects around them.

Technical Explanation

The CookAR system relies on an affordance segmentation model to find the interactive parts of kitchen tools. To build it, the authors collected and annotated what they describe as the first egocentric dataset of kitchen tool affordances, then fine-tuned a segmentation model on that data. The model outlines the key parts of each tool in the user's view, such as the handle a user should grasp or the blade they should avoid.
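To make the idea of affordance segmentation concrete, here is a minimal, dependency-free sketch of how a per-pixel label map could be split into one mask per affordance. The class names (`graspable`, `hazardous`), their numeric IDs, and the toy "knife" image are assumptions made for illustration; they are not taken from the paper, and a real pipeline would likely use an array library rather than nested lists.

```python
# Hypothetical affordance label IDs; the label set CookAR actually uses
# is not stated in this summary.
AFFORDANCE_CLASSES = {1: "graspable", 2: "hazardous"}


def split_affordance_masks(label_map):
    """Return one boolean mask (a list of rows) per affordance class."""
    return {
        name: [[pixel == class_id for pixel in row] for row in label_map]
        for class_id, name in AFFORDANCE_CLASSES.items()
    }


def count_pixels(mask):
    """Count how many pixels are set in a boolean mask."""
    return sum(sum(row) for row in mask)


# Toy 4x6 label map of a knife: handle on the left (1), blade on the
# right (2), and background everywhere else (0).
label_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 2, 2, 2],
    [0, 1, 1, 2, 2, 2],
    [0, 0, 0, 0, 0, 0],
]

masks = split_affordance_masks(label_map)
for name, mask in masks.items():
    print(name, count_pixels(mask))
```

Keeping one mask per affordance, rather than one mask per object, is what allows a handle and a blade on the same tool to be treated differently later on.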

This affordance information is then overlaid on the user's view through a head-mounted AR system paired with a stereo camera. CookAR renders the augmentations in real time, drawing the user's attention to the relevant affordances so that they can interact with each tool safely and efficiently.
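The overlay step can be pictured as alpha-blending a highlight color into the pixels covered by each affordance mask. The sketch below is an illustrative assumption, not CookAR's rendering code: the function names, the colors, and the 50% opacity are all hypothetical, and it uses plain Python lists of RGB tuples to stay self-contained.

```python
def blend(pixel, color, alpha):
    """Alpha-blend one RGB pixel toward a highlight color."""
    return tuple(round((1 - alpha) * p + alpha * c) for p, c in zip(pixel, color))


def overlay(frame, masks, colors, alpha=0.5):
    """Return a copy of `frame` with each masked region tinted by its color.

    frame  : list of rows, each row a list of (r, g, b) tuples
    masks  : dict mapping an affordance name to a boolean mask of the same shape
    colors : dict mapping an affordance name to an (r, g, b) highlight color
    """
    out = [list(row) for row in frame]  # copy rows so the input is not mutated
    for name, mask in masks.items():
        for y, row in enumerate(mask):
            for x, is_set in enumerate(row):
                if is_set:
                    out[y][x] = blend(out[y][x], colors[name], alpha)
    return out


# Toy 1x3 gray frame: left pixel is a handle, right pixel is a blade.
frame = [[(100, 100, 100), (100, 100, 100), (100, 100, 100)]]
masks = {
    "graspable": [[True, False, False]],
    "hazardous": [[False, False, True]],
}
# Hypothetical color choice: green for graspable parts, red for hazardous ones.
colors = {"graspable": (0, 200, 0), "hazardous": (200, 0, 0)}

highlighted = overlay(frame, masks, colors)
print(highlighted)
```

A whole-object augmentation, by contrast, would tint every masked pixel with a single color, which loses the distinction between the part to grasp and the part to avoid.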

The researchers evaluated CookAR in two ways. A technical evaluation showed that the fine-tuned model outperforms the baseline on the authors' tool affordance dataset. A qualitative lab study with 10 participants who have low vision then explored which augmentation designs were suitable, and it indicated a preference for affordance augmentations over traditional whole-object augmentations.

Because the lab study was qualitative, its main output is participant feedback on augmentation design, usability, and perceived benefits rather than measured gains in speed or accuracy.

Critical Analysis

The CookAR paper presents a promising approach to leveraging augmented reality to support individuals with low vision in the kitchen. The use of computer vision to detect and highlight the affordances of kitchen tools is a clever solution to a real-world accessibility challenge.

One potential limitation of the research is the relatively small sample size of the user study. With 10 participants and a qualitative design, the findings describe preferences rather than measured performance gains. A larger and more diverse participant pool would help strengthen the findings and provide a better understanding of how CookAR performs across different types of visual impairments.

Additionally, the technical evaluation compares the fine-tuned model against a baseline on the authors' own tool affordance dataset. How well the model generalizes to other kitchens, tools, and lighting conditions is a question that further evaluation could address, and it would help other researchers build upon this work.

That said, the overall concept of CookAR is compelling, and the positive user feedback suggests that it could have a meaningful impact on the daily lives of people with low vision. Further refinement and validation of the system could lead to valuable assistive technology for the kitchen and beyond.

Conclusion

The CookAR system demonstrates how augmented reality can be leveraged to enhance the accessibility of physical environments for individuals with low vision. By using computer vision to detect and highlight the affordances of kitchen tools, CookAR provides visual cues and information to help users interact with their surroundings more effectively.

The user study results suggest that people with low vision prefer affordance augmentations over whole-object highlighting, which could make daily activities like cooking safer and more independent for people with visual impairments. While further research is needed to refine and validate the system, this work represents an important step towards creating more inclusive and adaptable technologies for the real world.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision

Jaewook Lee, Andrew D. Tjahjadi, Jiho Kim, Junpu Yu, Minji Park, Jiawen Zhang, Jon E. Froehlich, Yapeng Tian, Yuhang Zhao

Cooking is a central activity of daily living, supporting independence as well as mental and physical health. However, prior work has highlighted key barriers for people with low vision (LV) to cook, particularly around safely interacting with tools, such as sharp knives or hot pans. Drawing on recent advancements in computer vision (CV), we present CookAR, a head-mounted AR system with real-time object affordance augmentations to support safe and efficient interactions with kitchen tools. To design and implement CookAR, we collected and annotated the first egocentric dataset of kitchen tool affordances, fine-tuned an affordance segmentation model, and developed an AR system with a stereo camera to generate visual augmentations. To validate CookAR, we conducted a technical evaluation of our fine-tuned model as well as a qualitative lab study with 10 LV participants for suitable augmentation design. Our technical evaluation demonstrates that our model outperforms the baseline on our tool affordance dataset, while our user study indicates a preference for affordance augmentations over the traditional whole object augmentations.

7/30/2024

Methodology to Deploy CNN-Based Computer Vision Models on Immersive Wearable Devices

Kaveh Malek (Department of Mechanical Engineering, University of New Mexico, New Mexico), Fernando Moreu (Department of Civil, Construction and Environmental Engineering, University of New Mexico, New Mexico)

Convolutional Neural Network (CNN) models often lack the ability to incorporate human input, which can be addressed by Augmented Reality (AR) headsets. However, current AR headsets face limitations in processing power, which has prevented researchers from performing real-time, complex image recognition tasks using CNNs in AR headsets. This paper presents a method to deploy CNN models on AR headsets by training them on computers and transferring the optimized weight matrices to the headset. The approach transforms the image data and CNN layers into a one-dimensional format suitable for the AR platform. We demonstrate this method by training the LeNet-5 CNN model on the MNIST dataset using PyTorch and deploying it on a HoloLens AR headset. The results show that the model maintains an accuracy of approximately 98%, similar to its performance on a computer. This integration of CNN and AR enables real-time image processing on AR headsets, allowing for the incorporation of human input into AI models.

7/2/2024

A Recipe for Success? Exploring Strategies for Improving Non-Visual Access to Cooking Instructions

Franklin Mingzhe Li, Ashley Wang, Patrick Carrington, Shaun K. Kane

Cooking is an essential activity that enhances quality of life by enabling individuals to prepare their own meals. However, cooking often requires multitasking between cooking tasks and following instructions, which can be challenging to cooks with vision impairments if recipes or other instructions are inaccessible. To explore the practices and challenges of recipe access while cooking, we conducted semi-structured interviews with 20 people with vision impairments who have cooking experience and four cooking instructors at a vision rehabilitation center. We also asked participants to edit and give feedback on existing recipes. We revealed unique practices and challenges to accessing recipe information at different cooking stages, such as the heavy burden of hand-washing to interact with recipe readers. We also presented the preferred information representation and structure of recipes. We then highlighted design features of technological supports that could facilitate the development of more accessible kitchen technologies for recipe access. Our work contributes nuanced insights and design guidelines to enhance recipe accessibility for people with vision impairments.

7/30/2024

Ping! Your Food is Ready: Comparing Different Notification Techniques in 3D AR Cooking Environment

Aditya Raikwar, Lucas Plabst, Anil Ufuk Batmaz, Florian Niebling, Francisco R. Ortega

Implementing visual and audio notifications on augmented reality devices is a crucial element of intuitive and easy-to-use interfaces. In this paper, we explored creating intuitive interfaces through visual and audio notifications. The study evaluated user performance and preference across three conditions: visual notifications in fixed positions, visual notifications above objects, and no visual notifications with monaural sounds. The users were tasked with cooking and serving customers in an open-source Augmented-Reality sandbox environment called ARtisan Bistro. The results indicated that visual notifications above objects combined with localized audio feedback were the most effective and preferred method by participants. The findings highlight the importance of strategic placement of visual and audio notifications in AR, providing insights for engineers and developers to design intuitive 3D user interfaces.

9/18/2024