EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels

2406.07151

Published 6/12/2024 by Shuqi Zhu, Ziyi Ye, Qingyao Ai, Yiqun Liu


Abstract

Identifying and reconstructing what we see from brain activity gives us a special insight into investigating how the biological visual system represents the world. While recent efforts have achieved high-performance image classification and high-quality image reconstruction from brain signals collected by Functional Magnetic Resonance Imaging (fMRI) or magnetoencephalogram (MEG), the cost and bulk of these devices make it difficult to bring this work into practical applications. On the other hand, Electroencephalography (EEG), despite its advantages of ease of use, cost-efficiency, high temporal resolution, and non-invasive nature, has not been fully explored in relevant studies due to the lack of comprehensive datasets. To address this gap, we introduce EEG-ImageNet, a novel EEG dataset comprising recordings from 16 subjects exposed to 4000 images selected from the ImageNet dataset. EEG-ImageNet contains five times as many EEG-image pairs as existing similar EEG benchmarks. EEG-ImageNet is collected with image stimuli of multi-granularity labels, i.e., 40 images with coarse-grained labels and 40 with fine-grained labels. Based on it, we establish benchmarks for object classification and image reconstruction. Experiments with several commonly used models show that the best models can achieve object classification with accuracy around 60% and image reconstruction with two-way identification around 64%. These results demonstrate the dataset's potential to advance EEG-based visual brain-computer interfaces, understand the visual perception of biological systems, and provide potential applications in improving machine visual models.


Overview

This paper introduces a new dataset called "EEG-ImageNet" that contains electroencephalogram (EEG) recordings and associated image data with multi-granular labels.
The dataset is intended to serve as a benchmark for research on brain-computer interfaces and decoding visual perception from EEG signals.
The paper also presents several baselines and experimental results using the dataset, demonstrating its usefulness for evaluating machine learning models in this domain.

Plain English Explanation

The researchers have created a new dataset that combines brain activity data (called EEG) with images that people looked at. This dataset is designed to help study how the brain processes and understands visual information.

By having both the brain data and the images people saw, researchers can try to 'decode' or figure out what the brain is doing when it sees different types of images. This could be useful for building brain-computer interfaces, where a computer system can interpret a person's brain activity to control something, like a prosthetic limb or a computer interface.

The dataset includes images with multiple levels of detail in the labels, from broad categories like "animal" down to more specific labels like the exact animal species. This allows researchers to explore how the brain represents different levels of visual information.

The researchers also provide some initial results using machine learning models to analyze the EEG data and predict the images people were looking at. This demonstrates the potential of the dataset to advance research in this area.

Technical Explanation

The EEG-ImageNet dataset is a new benchmark for studying the decoding of visual perception from electroencephalogram (EEG) signals. It contains EEG recordings from 16 subjects viewing 4000 images selected from ImageNet, along with multi-granular labels for the images, split between coarse-grained and fine-grained categories.

The coarse-grained and fine-grained labels let researchers examine how the brain represents visual information at different levels of abstraction.

The paper establishes benchmarks for two tasks, object classification and image reconstruction, using several commonly used models. According to the abstract, the best models reach about 60% classification accuracy and about 64% two-way identification on image reconstruction.
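The abstract reports reconstruction quality as two-way identification. As a rough illustration of how such a score is commonly computed, the sketch below compares each reconstruction with its own ground-truth image and with every other image as a distractor, and counts how often the true image is the closer match. The feature vectors, the use of cosine similarity, and the function names are assumptions made for this example; the summary does not say which feature extractor or similarity measure the paper uses.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def two_way_identification(recon_feats, true_feats):
    """Fraction of (target, distractor) comparisons in which a reconstruction
    is more similar to its own ground-truth image than to the distractor."""
    n = len(recon_feats)
    wins, total = 0, 0
    for i in range(n):
        sim_true = cosine(recon_feats[i], true_feats[i])
        for j in range(n):
            if i == j:
                continue  # skip comparing the target with itself
            total += 1
            if sim_true > cosine(recon_feats[i], true_feats[j]):
                wins += 1
    return wins / total

# Toy, made-up feature vectors (not data from the paper).
true_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
recon_feats = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]
score = two_way_identification(recon_feats, true_feats)
print(f"two-way identification: {score:.3f}")
```

With random reconstructions this score sits near 50%, which is the chance level against which a figure such as 64% should be read.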

Critical Analysis

The EEG-ImageNet dataset and the baselines presented in the paper make valuable contributions to the field of brain-computer interfaces and visual perception decoding from EEG signals. However, the authors acknowledge several limitations and areas for further research.

One key limitation is the relatively small size of the dataset compared to large-scale computer vision benchmarks like ImageNet. Expanding the dataset with more participants and images could improve the robustness and generalizability of the models.

Additionally, the paper does not explore the temporal dynamics of the EEG signals in depth, focusing primarily on static classification tasks. Incorporating the temporal information may lead to further improvements in decoding performance.
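One simple way to bring temporal information into a decoder is to summarize each EEG trial over sliding time windows rather than over the whole epoch. The following is a minimal sketch under stated assumptions: the trial is a channels-by-time list of samples, the feature is mean squared amplitude per channel, and the window and step sizes are arbitrary. None of these choices come from the paper.

```python
def windowed_power(trial, window, step):
    """Mean squared amplitude per channel in each sliding window.

    trial:  list of channels, each a list of samples (channels x time).
    window: window length in samples (assumed value, chosen by the caller).
    step:   hop between window starts in samples.
    Returns a list of windows, each holding one value per channel.
    """
    n_samples = len(trial[0])
    features = []
    for start in range(0, n_samples - window + 1, step):
        features.append([
            sum(x * x for x in channel[start:start + window]) / window
            for channel in trial
        ])
    return features

# Toy two-channel trial with four samples per channel (made-up values).
trial = [[1.0, 1.0, 2.0, 2.0],
         [0.0, 0.0, 3.0, 3.0]]
print(windowed_power(trial, window=2, step=2))
```

Feeding the per-window features to a classifier in order, instead of one averaged vector, lets a model see how the response changes after stimulus onset.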

The authors also note that the current baselines do not leverage the multi-granular labels to their full potential. Developing models that can jointly predict the broad category and fine-grained class of the viewed image could provide deeper insights into the brain's visual representations.
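A first step toward using both label levels is to score a fine-grained classifier at both granularities, so that a prediction of the wrong breed but the right animal still earns coarse-level credit. The sketch below assumes a fine-to-coarse mapping is available; the category names and the `FINE_TO_COARSE` table are hypothetical and are not taken from the dataset.

```python
# Hypothetical fine-to-coarse mapping; the category names are illustrative only.
FINE_TO_COARSE = {
    "golden retriever": "dog",
    "german shepherd": "dog",
    "tabby cat": "cat",
    "siamese cat": "cat",
}

def granularity_accuracy(y_true, y_pred, fine_to_coarse):
    """Return (fine accuracy, coarse accuracy) for fine-grained predictions."""
    pairs = list(zip(y_true, y_pred))
    fine_hits = sum(t == p for t, p in pairs)
    coarse_hits = sum(fine_to_coarse[t] == fine_to_coarse[p] for t, p in pairs)
    return fine_hits / len(pairs), coarse_hits / len(pairs)

# Made-up labels and predictions for four trials.
y_true = ["golden retriever", "german shepherd", "tabby cat", "siamese cat"]
y_pred = ["golden retriever", "golden retriever", "siamese cat", "german shepherd"]
fine_acc, coarse_acc = granularity_accuracy(y_true, y_pred, FINE_TO_COARSE)
print(fine_acc, coarse_acc)
```

A large gap between the two numbers would suggest the EEG features carry category-level information that the classifier cannot yet resolve into fine-grained classes.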

Despite these limitations, the EEG-ImageNet dataset and the presented research establish a solid foundation for future work in this field. Continued advancements in this area could have important implications for brain-computer interfaces, cognitive neuroscience, and our understanding of human visual perception.

Conclusion

The EEG-ImageNet dataset and the research presented in this paper are a step forward for brain-computer interfaces and the decoding of visual perception from EEG signals. By providing a multi-granular dataset with five times as many EEG-image pairs as existing similar EEG benchmarks, together with baseline models, the authors have created a resource for the research community to build upon.

The potential applications of this work range from assistive technologies for individuals with disabilities to fundamental insights into the neural mechanisms underlying human visual cognition. As the field evolves, the EEG-ImageNet dataset could support the development of more capable brain-computer interfaces and a better understanding of how the brain processes visual information.

