Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

CognitionCapturer: Decoding Visual Stimuli from Human EEG Signal with Multimodal Information

Authors: Kaifan Zhang, Lihuo He, Xin Jiang, Wen Lu, Di Wang, Xinbo Gao

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Through extensive experiments, we demonstrate that CognitionCapturer outperforms state-of-the-art methods both qualitatively and quantitatively.
Researcher Affiliation Academia Kaifan Zhang1, Lihuo He1*, Xin Jiang1, Wen Lu1, Di Wang2, Xinbo Gao1,3 1School of Electronic Engineering, Xidian University, Xi'an, China 2School of Computer Science and Technology, Xidian University, Xi'an, China 3Chongqing University of Posts and Telecommunications, Chongqing, China EMAIL, EMAIL
Pseudocode No The paper describes the methodology in narrative text and using a block diagram (Figure 2), but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes Code: https://github.com/XiaoZhangYES/CognitionCapturer
Open Datasets Yes We utilized the Things-EEG dataset for our experiments. The Things-EEG dataset (Gifford et al. 2022) contains EEG data collected from 10 subjects under an RSVP paradigm.
Dataset Splits Yes The training set comprises 1654 concepts, each associated with 10 images presented four times, resulting in a total of 66,160 EEG recordings. The test set includes 200 unique concepts, each represented by a single image repeated 80 times, totaling 16,000 EEG recordings.
Hardware Specification Yes The proposed method is conducted on a single GeForce RTX 2080 Ti GPU.
Software Dependencies No The paper mentions several models and tools such as 'OpenCLIP ViT-H/14', 'BLIP2', 'Depth Anything', 'SDXL-Turbo', and 'IP-Adapters' but does not specify their version numbers or the version numbers of underlying programming languages or libraries (e.g., Python, PyTorch).
Experiment Setup Yes For the training of the modality expert encoder phase, we used the AdamW optimizer with a learning rate of 0.0003, a batch size of 1024, and trained for 20 epochs... During the training of the diffusion prior, we used a batch size of 512, trained for 100 epochs, and set the number of inference steps to 50. The guidance scale was set to 7.5... We set the inference steps for SDXL-Turbo to 5. When configuring the IP-Adapter, for the image modality, we used the full IP-Adapter with the scale set to 1. For the text and depth modalities, we set the scale of their respective IP-Adapter's Layout block and Style block to 0...
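The reported dataset splits and training hyperparameters can be collected into a small sketch for sanity-checking. This is a minimal illustration assembled from the figures quoted above; the constant and dictionary names are our own and do not come from the authors' code.

```python
# Dataset splits reported for the Things-EEG dataset (Gifford et al. 2022).
TRAIN_CONCEPTS, TRAIN_IMAGES_PER_CONCEPT, TRAIN_REPEATS = 1654, 10, 4
TEST_CONCEPTS, TEST_IMAGES_PER_CONCEPT, TEST_REPEATS = 200, 1, 80

# Each (concept, image, repeat) triple yields one EEG recording.
train_recordings = TRAIN_CONCEPTS * TRAIN_IMAGES_PER_CONCEPT * TRAIN_REPEATS
test_recordings = TEST_CONCEPTS * TEST_IMAGES_PER_CONCEPT * TEST_REPEATS

print(train_recordings)  # 66160, matching the paper's 66,160
print(test_recordings)   # 16000, matching the paper's 16,000

# Hyperparameters as reported in the Experiment Setup row
# (hypothetical variable names; values taken from the paper).
ENCODER_CFG = dict(optimizer="AdamW", lr=3e-4, batch_size=1024, epochs=20)
PRIOR_CFG = dict(batch_size=512, epochs=100, inference_steps=50,
                 guidance_scale=7.5)
SDXL_TURBO_INFERENCE_STEPS = 5
```

The split arithmetic reproduces the recording counts stated in the Dataset Splits row, which is a quick consistency check when re-running the pipeline.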