Smartphone-based Eye Tracking System using Edge Intelligence and Model Optimisation

Read original: arXiv:2408.12463 - Published 8/23/2024 by Nishan Gunawardena, Gough Yumu Lui, Jeewani Anupama Ginige, Bahman Javadi

Overview

Develops a smartphone-based eye tracking system using edge intelligence and model optimization
Enables accurate and real-time eye tracking on smartphones without relying on the cloud
Optimizes the deep learning model to run efficiently on mobile devices

Plain English Explanation

This paper presents a new smartphone-based eye tracking system that uses edge intelligence and model optimization techniques. The goal is to enable accurate and real-time eye tracking on smartphones without needing to send data to the cloud.

The key idea is to run the eye tracking algorithm directly on the smartphone, rather than relying on a remote server. Running on the device avoids the network round trip and bandwidth cost of a cloud service, which matters for real-time interactive applications such as games, VR, and AR. To make this work, the researchers optimized the deep learning model to run efficiently on mobile devices, reducing the computational and memory requirements.

Technical Explanation

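The system estimates the user's gaze from the smartphone's front-facing camera using a convolutional neural network (CNN) combined with a recurrent neural network, either a Long Short-Term Memory (LSTM) network or a Gated Recurrent Unit (GRU). The recurrent part lets the model use information from preceding frames, which targets video-type visual stimuli rather than static images. To deploy these models on a smartphone, the researchers applied several optimization techniques: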

Model Pruning: They reduced the size of the CNN model by removing unnecessary parameters and connections, without significantly impacting performance.
Quantization: They converted the model's weights and activations from floating-point to lower-precision integer formats, reducing the memory footprint and computation time.
Architecture Search: They used an automated neural architecture search to find an efficient model design tailored for mobile devices.
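
The first two techniques in the list can be illustrated on a plain list of weights. This is a minimal sketch, not the authors' implementation: the summary does not say which pruning criterion or quantization scheme the paper uses, so unstructured magnitude pruning and symmetric per-tensor int8 quantization are assumed here, and the function names and sample weights are hypothetical.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], sparsity: float) -> List[float]:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integer codes in [-127, 127] that share one scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights for one small layer.
weights = [0.50, -0.02, 0.31, 0.004, -0.76, 0.09]
pruned = magnitude_prune(weights, sparsity=0.5)  # half of the weights become 0.0
codes, scale = quantize_int8(pruned)             # 8-bit codes plus one float scale
restored = dequantize(codes, scale)              # values the quantized layer computes with
```

Each restored weight differs from its pruned original by at most half of `scale`, which is the rounding error introduced by quantization. Real deployments usually rely on a framework's own pruning and quantization tooling rather than hand-written routines like these.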

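These optimizations target real-time processing with lower energy, CPU, and memory usage on edge devices. The authors report that quantization reduced model inference time by 21.72% for the CNN+LSTM model and by 19.50% for the CNN+GRU model.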
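
A latency claim of this kind can be checked with a small timing harness. The sketch below is an illustration under stated assumptions, not the paper's benchmark: `median_latency_ms`, `percent_reduction`, and `meets_frame_budget` are hypothetical helpers, the two workloads are stand-ins for a baseline and an optimized model, and the 30 frames-per-second budget is an assumed definition of "real-time".

```python
import time
from statistics import median
from typing import Callable


def median_latency_ms(run_inference: Callable[[], object],
                      repeats: int = 50, warmup: int = 5) -> float:
    """Median wall-clock time of one call, in milliseconds."""
    for _ in range(warmup):  # warm-up calls are run but not timed
        run_inference()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


def percent_reduction(baseline_ms: float, optimized_ms: float) -> float:
    """Relative reduction in inference time, as a percentage of the baseline."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (baseline_ms - optimized_ms) / baseline_ms * 100.0


def meets_frame_budget(latency_ms: float, fps: float = 30.0) -> bool:
    """True if one inference fits within a single frame interval at `fps`."""
    return latency_ms <= 1000.0 / fps


# Stand-in workloads; a real test would call the two model variants instead.
baseline = median_latency_ms(lambda: sum(i * i for i in range(20000)), repeats=10)
optimized = median_latency_ms(lambda: sum(i * i for i in range(15000)), repeats=10)
print(f"baseline {baseline:.3f} ms, optimized {optimized:.3f} ms, "
      f"reduction {percent_reduction(baseline, optimized):.1f}%")
```

The median is used instead of the mean because occasional scheduler or garbage-collection pauses can skew an average on a busy device.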

Critical Analysis

The paper provides a thorough evaluation of the optimized eye tracking system, including comparisons to cloud-based approaches and other mobile eye trackers. For accuracy, the authors report an average root mean square error of 0.955 cm for the CNN+LSTM model and 1.091 cm for the CNN+GRU model.
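
The paper's abstract expresses gaze accuracy as a root mean square error in centimetres. The sketch below shows one common way to compute such a figure. It assumes the error for each sample is the Euclidean distance between the predicted and true on-screen gaze points, which the summary does not confirm; `gaze_rmse_cm` and the sample coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position on the screen, in centimetres


def gaze_rmse_cm(predicted: Sequence[Point], actual: Sequence[Point]) -> float:
    """Root mean square of the Euclidean distances between paired gaze points."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (px - ax) ** 2 + (py - ay) ** 2
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical predictions and ground truth for three video frames.
predicted = [(1.0, 2.0), (3.5, 4.0), (5.0, 5.5)]
actual = [(1.0, 2.5), (3.0, 4.0), (5.0, 6.5)]
error_cm = gaze_rmse_cm(predicted, actual)
```

Because the errors are squared before averaging, a few large misses raise this metric more than many small ones.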

However, the researchers do not discuss the potential privacy and security implications of performing sensitive eye tracking computations on the user's personal device. There may be concerns about data privacy and the risk of malicious apps exploiting the eye tracking capabilities.

Additionally, the paper does not explore the generalization of the optimization techniques to other mobile vision tasks beyond eye tracking. Further research could investigate the broader applicability of these methods to a wider range of mobile computer vision applications.

Conclusion

This paper presents an innovative smartphone-based eye tracking system that leverages edge intelligence and model optimization to achieve real-time performance and high accuracy. The techniques described could have important implications for developing efficient and privacy-preserving mobile computer vision applications. Future work should address the potential privacy concerns and explore the generalization of these optimization methods to other mobile vision tasks.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Smartphone-based Eye Tracking System using Edge Intelligence and Model Optimisation

Nishan Gunawardena, Gough Yumu Lui, Jeewani Anupama Ginige, Bahman Javadi

A significant limitation of current smartphone-based eye-tracking algorithms is their low accuracy when applied to video-type visual stimuli, as they are typically trained on static images. Also, the increasing demand for real-time interactive applications like games, VR, and AR on smartphones requires overcoming the limitations posed by resource constraints such as limited computational power, battery life, and network bandwidth. Therefore, we developed two new smartphone eye-tracking techniques for video-type visuals by combining Convolutional Neural Networks (CNN) with two different Recurrent Neural Networks (RNN), namely Long Short Term Memory (LSTM) and Gated Recurrent Unit (GRU). Our CNN+LSTM and CNN+GRU models achieved an average Root Mean Square Error of 0.955cm and 1.091cm, respectively. To address the computational constraints of smartphones, we developed an edge intelligence architecture to enhance the performance of smartphone-based eye tracking. We applied various optimisation methods like quantisation and pruning to deep learning models for better energy, CPU, and memory usage on edge devices, focusing on real-time processing. Using model quantisation, the model inference time in the CNN+LSTM and CNN+GRU models was reduced by 21.72% and 19.50%, respectively, on edge devices.

8/23/2024

Open Gaze: Open Source eye tracker for smartphone devices using Deep Learning

Sushmanth reddy, Jyothi Swaroop Reddy

Eye tracking has been a pivotal tool in diverse fields such as vision research, language analysis, and usability assessment. The majority of prior investigations, however, have concentrated on expansive desktop displays employing specialized, costly eye tracking hardware that lacks scalability. Remarkably little insight exists into ocular movement patterns on smartphones, despite their widespread adoption and significant usage. In this manuscript, we present an open-source implementation of a smartphone-based gaze tracker that emulates the methodology proposed by a GooglePaper (whose source code remains proprietary). Our focus is on attaining accuracy comparable to that attained through the GooglePaper's methodology, without the necessity for supplementary hardware. Through the integration of machine learning techniques, we unveil an accurate eye tracking solution that is native to smartphones. Our approach demonstrates precision akin to the state-of-the-art mobile eye trackers, which are characterized by a cost that is two orders of magnitude higher. Leveraging the vast MIT GazeCapture dataset, which is available through registration on the dataset's website, we successfully replicate crucial findings from previous studies concerning ocular motion behavior in oculomotor tasks and saliency analyses during natural image observation. Furthermore, we emphasize the applicability of smartphone-based gaze tracking in discerning reading comprehension challenges. Our findings exhibit the inherent potential to amplify eye movement research by significant proportions, accommodating participation from thousands of subjects with explicit consent. This scalability not only fosters advancements in vision research, but also extends its benefits to domains such as accessibility enhancement and healthcare applications.

9/5/2024

Co-designing a Sub-millisecond Latency Event-based Eye Tracking System with Submanifold Sparse CNN

Baoheng Zhang, Yizhao Gao, Jingyuan Li, Hayden Kwok-Hay So

Eye-tracking technology is integral to numerous consumer electronics applications, particularly in the realm of virtual and augmented reality (VR/AR). These applications demand solutions that excel in three crucial aspects: low-latency, low-power consumption, and precision. Yet, achieving optimal performance across all these fronts presents a formidable challenge, necessitating a balance between sophisticated algorithms and efficient backend hardware implementations. In this study, we tackle this challenge through a synergistic software/hardware co-design of the system with an event camera. Leveraging the inherent sparsity of event-based input data, we integrate a novel sparse FPGA dataflow accelerator customized for submanifold sparse convolution neural networks (SCNN). The SCNN implemented on the accelerator can efficiently extract the embedding feature vector from each representation of event slices by only processing the non-zero activations. Subsequently, these vectors undergo further processing by a gated recurrent unit (GRU) and a fully connected layer on the host CPU to generate the eye centers. Deployment and evaluation of our system reveal outstanding performance metrics. On the Event-based Eye-Tracking-AIS2024 dataset, our system achieves 81% p5 accuracy, 99.5% p10 accuracy, and 3.71 Mean Euclidean Distance with 0.7 ms latency while only consuming 2.29 mJ per inference. Notably, our solution opens up opportunities for future eye-tracking systems. Code is available at https://github.com/CASR-HKU/ESDA/tree/eye_tracking.

4/23/2024

EEGMobile: Enhancing Speed and Accuracy in EEG-Based Gaze Prediction with Advanced Mobile Architectures

Teng Liang, Andrews Damoah

Electroencephalography (EEG) analysis is an important domain in the realm of Brain-Computer Interface (BCI) research. To ensure BCI devices are capable of providing practical applications in the real world, brain signal processing techniques must be fast, accurate, and resource-conscious to deliver low-latency neural analytics. This study presents a model that leverages a pre-trained MobileViT alongside Knowledge Distillation (KD) for EEG regression tasks. Our results showcase that this model is capable of performing at a level comparable (only 3% lower) to the previous State-Of-The-Art (SOTA) on the EEGEyeNet Absolute Position Task while being 33% faster and 60% smaller. Our research presents a cost-effective model applicable to resource-constrained devices and contributes to expanding future research on lightweight, mobile-friendly models for EEG regression.

8/9/2024