GazeIntent: Adapting dwell-time selection in VR interaction with real-time intent modeling

Read original: arXiv:2404.13829 - Published 4/23/2024 by Anish S. Narkar, Jan J. Michalak, Candace E. Peacock, Brendan David-John

Overview

This paper introduces GazeIntent, a system that adapts dwell-time selection in virtual reality (VR) interactions using real-time intent modeling.
The researchers developed a predictive model to estimate a user's intent to interact with virtual objects based on their gaze behavior.
GazeIntent aims to improve the efficiency and user experience of gaze-based interactions in VR by adjusting the dwell-time required to select an object based on the predicted intent.

Plain English Explanation

The paper presents GazeIntent, a system that improves how people control virtual reality (VR) applications with their eyes. In VR, users often have to look at an object for a set amount of time (the "dwell time") to select it. GazeIntent uses machine learning to predict the user's intent, that is, whether they actually want to select an object, from how they are looking around. If GazeIntent predicts that the user intends to select an object, it can shorten the required dwell time, making the interaction faster and more natural. This could be helpful for gaze-guided interactions in mixed reality, where users need to control virtual interfaces with their eyes.
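To make the idea of dwell time concrete, here is a minimal sketch of the fixed-threshold baseline that GazeIntent sets out to improve. The class name `DwellSelector`, its fields, and the default threshold are all assumptions made for illustration; none of them come from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DwellSelector:
    """Fixed-threshold dwell selection (hypothetical baseline, not the paper's code)."""
    dwell_threshold_s: float = 1.0   # assumed value, not taken from the paper
    _target: Optional[str] = None    # object currently under the gaze ray
    _elapsed_s: float = 0.0          # continuous time spent on that object

    def update(self, gazed_object: Optional[str], dt_s: float) -> Optional[str]:
        """Feed one gaze sample; return the object id if it was just selected."""
        if gazed_object != self._target:
            # Gaze moved to a different object (or to empty space): restart the timer.
            self._target = gazed_object
            self._elapsed_s = 0.0
            return None
        if gazed_object is None:
            return None
        self._elapsed_s += dt_s
        if self._elapsed_s >= self.dwell_threshold_s:
            self._elapsed_s = 0.0  # reset so the object is not re-selected immediately
            return gazed_object
        return None
```

With a fixed threshold, every selection costs the same waiting time, and a brief glance away restarts the timer. That fixed cost is what an adaptive threshold tries to reduce.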

Technical Explanation

The key components of GazeIntent are:

Gaze Tracking: The system tracks the user's gaze behavior in real-time using eye-tracking hardware in the VR headset.
Intent Prediction: GazeIntent uses a machine learning model to predict the user's intent to interact with a virtual object based on their recent gaze patterns. According to the paper's abstract, the intent prediction models were developed on a dataset of users performing selections in arithmetic tasks and reached F1 = 0.94.
Dwell-time Adaptation: The system dynamically adjusts the dwell time required to select an object based on the predicted intent. If the intent model thinks the user wants to interact, the dwell time is reduced, enabling faster selections.
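The dwell-time adaptation step can be sketched as a function from predicted intent to a dwell threshold. This is only an illustration under stated assumptions: the intent model is assumed to output a probability between 0 and 1, and the mapping is assumed to be linear between a minimum and a maximum dwell time. The function name, parameters, and default values are hypothetical, and this summary does not say which mapping the paper actually uses.

```python
def adapt_dwell_threshold(intent_prob: float,
                          min_dwell_s: float = 0.25,
                          max_dwell_s: float = 1.0) -> float:
    """Map a predicted intent probability to a dwell threshold in seconds.

    Assumption: a linear mapping. High predicted intent gives a short dwell,
    low predicted intent gives a long one. Defaults are illustrative only.
    """
    if min_dwell_s > max_dwell_s:
        raise ValueError("min_dwell_s must not exceed max_dwell_s")
    # Clamp out-of-range model outputs to [0, 1] before interpolating.
    p = min(1.0, max(0.0, intent_prob))
    return max_dwell_s - p * (max_dwell_s - min_dwell_s)
```

In a real system the returned value would replace the fixed dwell threshold on each frame or each gaze event. A linear mapping is the simplest choice; a step function or a smoothed probability would be equally plausible designs.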

The researchers evaluated GazeIntent in an end-user study in which returning and new users performed additional tasks with varied selection frequencies. The paper's abstract reports that personalized models for returning users effectively accounted for prior experience and were preferred by 63% of users.

Critical Analysis

The paper provides a promising approach to enhancing gaze-based interactions in VR, but there are a few aspects that could be explored further:

The intent prediction models were developed on data from arithmetic selection tasks, so how well they generalize to a wider range of VR interactions is unclear. Additional testing in more diverse scenarios would help validate the approach.
The paper does not address potential issues with the reliability and accuracy of gaze tracking in dynamic VR environments, which could impact the intent prediction and dwell-time adaptation.
While the user study results are positive, the sample size is relatively small. Larger-scale evaluations would help strengthen the confidence in the system's performance.

Conclusion

Overall, the GazeIntent system represents an innovative approach to improving gaze-based interactions in virtual reality. By using real-time intent modeling to adapt the dwell-time selection, the researchers have demonstrated a way to enhance the efficiency and user experience of controlling VR interfaces with the eyes alone. Further refinement and testing of the approach could lead to more naturalistic and responsive gaze-based interfaces for immersive computing.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

GazeIntent: Adapting dwell-time selection in VR interaction with real-time intent modeling

Anish S. Narkar, Jan J. Michalak, Candace E. Peacock, Brendan David-John

The use of ML models to predict a user's cognitive state from behavioral data has been studied for various applications, including predicting the intent to perform selections in VR. We developed a novel technique that uses gaze-based intent models to adapt dwell-time thresholds to aid gaze-only selection. A dataset of users performing selection in arithmetic tasks was used to develop intent prediction models (F1 = 0.94). We developed GazeIntent to adapt selection dwell times based on intent model outputs and conducted an end-user study with returning and new users performing additional tasks with varied selection frequencies. Personalized models for returning users effectively accounted for prior experience and were preferred by 63% of users. Our work provides the field with methods to adapt dwell-based selection to users, account for experience over time, and consider tasks that vary by selection frequency.

4/23/2024

Time Matters: Enhancing Pre-trained News Recommendation Models with Robust User Dwell Time Injection

Hao Jiang, Chuanzhen Li, Mingxiao An

Large Language Models (LLMs) have revolutionized text comprehension, leading to State-of-the-Art (SOTA) news recommendation models that utilize LLMs for in-depth news understanding. Despite this, accurately modeling user preferences remains challenging due to the inherent uncertainty of click behaviors. Techniques like multi-head attention in Transformers seek to alleviate this by capturing interactions among clicks, yet they fall short in integrating explicit feedback signals. User Dwell Time emerges as a powerful indicator, offering the potential to enhance the weak signals emanating from clicks. Nonetheless, its real-world applicability is questionable, especially when dwell time data collection is subject to delays. To bridge this gap, this paper proposes two novel and robust dwell time injection strategies, namely Dwell time Weight (DweW) and Dwell time Aware (DweA). DweW concentrates on refining Effective User Clicks through detailed analysis of dwell time, integrating with initial behavioral inputs to construct a more robust user preference. DweA empowers the model with awareness of dwell time information, thereby facilitating autonomous adjustment of attention values in user modeling. This enhancement sharpens the model's ability to accurately identify user preferences. In our experiment using the real-world news dataset from MSN website, we validated that our two strategies significantly improve recommendation performance, favoring high-quality news. Crucially, our approaches exhibit robustness to user dwell time information, maintaining their ability to recommend high-quality content even in extreme cases where dwell time data is entirely missing.

5/22/2024

Predicting the Intention to Interact with a Service Robot: the Role of Gaze Cues

Simone Arreghini, Gabriele Abbate, Alessandro Giusti, Antonio Paolillo

For a service robot, it is crucial to perceive as early as possible that an approaching person intends to interact: in this case, it can proactively enact friendly behaviors that lead to an improved user experience. We solve this perception task with a sequence-to-sequence classifier of a potential user intention to interact, which can be trained in a self-supervised way. Our main contribution is a study of the benefit of features representing the person's gaze in this context. Extensive experiments on a novel dataset show that the inclusion of gaze cues significantly improves the classifier performance (AUROC increases from 84.5% to 91.2%); the distance at which an accurate classification can be achieved improves from 2.4 m to 3.2 m. We also quantify the system's ability to adapt to new environments without external supervision. Qualitative experiments show practical applications with a waiter robot.

4/3/2024

Explainable Interfaces for Rapid Gaze-Based Interactions in Mixed Reality

Mengjie Yu, Dustin Harris, Ian Jones, Ting Zhang, Yue Liu, Naveen Sendhilnathan, Narine Kokhlikyan, Fulton Wang, Co Tran, Jordan L. Livingston, Krista E. Taylor, Zhenhong Hu, Mary A. Hood, Hrvoje Benko, Tanya R. Jonker

Gaze-based interactions offer a potential way for users to naturally engage with mixed reality (XR) interfaces. Black-box machine learning models enabled higher accuracy for gaze-based interactions. However, due to the black-box nature of the model, users might not be able to understand and effectively adapt their gaze behaviour to achieve high quality interaction. We posit that explainable AI (XAI) techniques can facilitate understanding of and interaction with gaze-based model-driven systems in XR. To study this, we built a real-time, multi-level XAI interface for gaze-based interaction using a deep learning model, and evaluated it during a visual search task in XR. A between-subjects study revealed that participants who interacted with XAI made more accurate selections compared to those who did not use the XAI system (i.e., F1 score increase of 10.8%). Additionally, participants who used the XAI system adapted their gaze behavior over time to make more effective selections. These findings suggest that XAI can potentially be used to assist users in more effective collaboration with model-driven interactions in XR.

4/23/2024