[ICASSP 2025] BrainVis: Exploring the Bridge between Brain and Visual Signals via Image Reconstruction [Link to paper]
Abstract: Analyzing and reconstructing visual stimuli from brain signals effectively advances our understanding of the human visual system. However, EEG signals are complex and contain significant noise, leading to substantial limitations in existing approaches to reconstructing visual stimuli from EEG. These limitations include difficulties in aligning EEG embeddings with fine-grained semantic information and a heavy reliance on additional large-scale datasets for training. To address these challenges, we propose a novel approach called BrainVis. This approach introduces a self-supervised paradigm to learn EEG time-domain features and incorporates frequency-domain features to enhance EEG representations. We also propose a multi-modal alignment method called semantic interpolation to achieve fine-grained semantic reconstruction. Additionally, we employ cascaded diffusion models to reconstruct images. Using only 9.1% of the training data required by previous mask modeling works, our proposed BrainVis outperforms state-of-the-art methods in both semantic fidelity reconstruction and generation quality.
Environment
We recommend installing 64-bit Python 3.10.12 and PyTorch 2.5.1. On a CUDA GPU machine, the following will do the trick:

```shell
pip install numpy==1.26.0
pip install ftfy==6.2.0
pip install omegaconf==2.3.0
pip install einops==0.8.0
pip install torchmetrics==1.4.0.post0
pip install pytorch-lightning==2.3.3
pip install transformers==4.42.4
pip install kornia==0.7.3
pip install diffusers==0.29.2
```
We have done all testing and development on an NVIDIA A100 GPU.
Create paths
```shell
python create_path.py
```
Download required files
- CLIP. Place the "clip" folder in this project.
- Pre-trained Stable Diffusion checkpoint v1-5-pruned-emaonly. Place "v1-5-pruned-emaonly.ckpt" in the path "/pretrained_model".
- EEG-Image pairs dataset. Place "block_splits_by_image_all.pth", "block_splits_by_image_single.pth", and "eeg_5_95_std.pth" in the path "/data/EEG".
- A copy of the required ImageNet subset. Unzip it to the path "/data/image".
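After downloading, it can help to sanity-check that everything sits where the training scripts expect it. A minimal sketch (the helper name is ours, not part of the repo; the paths follow the list above):

```shell
# Hypothetical helper: report any required file or folder from the
# list above that is missing under the given project root.
check_brainvis_files() {
  local root="${1:-.}" missing=0
  local files=(
    "clip"
    "pretrained_model/v1-5-pruned-emaonly.ckpt"
    "data/EEG/block_splits_by_image_all.pth"
    "data/EEG/block_splits_by_image_single.pth"
    "data/EEG/eeg_5_95_std.pth"
    "data/image"
  )
  for f in "${files[@]}"; do
    if [ ! -e "$root/$f" ]; then
      echo "missing: $root/$f"
      missing=1
    fi
  done
  return "$missing"
}
```

Run `check_brainvis_files` from the project root; it prints each missing path and returns a non-zero status if anything is absent.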
Obtain the training data required for the alignment process
```shell
python imageBLIPtoCLIP.py
python imageLabeltoCLIP.py
```
- Run `train_freqencoder.py` to train the frequency encoder.
- Run `main.py` to pre-train the time encoder.
- Comment out "trainer.pretrain()" on line 59 of `main.py`, and uncomment "trainer.finetune()" on line 61. Run `main.py` to fine-tune the time encoder.
- Modify "_all" to "_single" in line 14 of `datautils.py`, and change "default=0" to any number from 1 to 6 in line 19 to use a different single subject. Comment out line 61 in `main.py` and uncomment "trainer.finetune_timefreq()" on line 64. Run `main.py` to integrate the time and frequency models.
- Comment out line 64 of `main.py`, and uncomment "trainer.finetune_CLIP()" on line 65. Run `main.py` to conduct cross-modal EEG alignment.
- Modify "train_mode=" to "False" on line 56 of `main.py` and run it to save the alignment results for reconstruction.
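For reference, the steps above can be sketched as a single script. This is a non-authoritative sketch, wrapped in a function so it can be sourced without running; the edits to `main.py` and `datautils.py` between steps still have to be made by hand as described above:

```shell
# Sketch of the full training order described above. The manual
# source edits between steps are marked as comments.
run_brainvis_training() {
  python train_freqencoder.py   # 1. train the frequency encoder
  python main.py                # 2. pre-train the time encoder (trainer.pretrain())
  # edit main.py: comment line 59, uncomment trainer.finetune() on line 61
  python main.py                # 3. fine-tune the time encoder
  # optional: edit datautils.py lines 14/19 for a single-subject run;
  # edit main.py: comment line 61, uncomment trainer.finetune_timefreq() on line 64
  python main.py                # 4. integrate the time and frequency models
  # edit main.py: comment line 64, uncomment trainer.finetune_CLIP() on line 65
  python main.py                # 5. cross-modal EEG alignment
  # edit main.py: set train_mode=False on line 56
  python main.py                # 6. save alignment results for reconstruction
}
```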
```shell
python cascade_diffusion.py
```
Results will be saved in the path "/picture-gene".
To obtain the pretrained checkpoint, please leave your email address in a GitHub issue (we will respond as soon as possible), or contact us directly via email at hfu006@e.ntu.edu.sg (some emails might be missed). We will send you the checkpoint along with the usage instructions.
BrainVis builds upon several previous works:
- High-resolution image synthesis with latent diffusion models (CVPR 2022)
- Learning Transferable Visual Models From Natural Language Supervision (ICML 2021)
- Seeing beyond the brain: Masked modeling conditioned diffusion model for human vision decoding (CVPR 2023)
- Deep learning human mind for automated visual classification (CVPR 2017)
- TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders
If you find BrainVis useful for your research, we would greatly appreciate it if you could star it on GitHub and cite using this BibTeX.
```bibtex
@inproceedings{fu2025brainvis,
  title={BrainVis: Exploring the bridge between brain and visual signals via image reconstruction},
  author={Fu, Honghao and Wang, Hao and Chin, Jing Jih and Shen, Zhiqi},
  booktitle={ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={1--5},
  year={2025},
  organization={IEEE}
}
```

