Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
See Through Their Minds: Learning Transferable Brain Decoding Models from Cross-Subject fMRI
Authors: Yulong Liu, Yongqiang Ma, Guibo Zhu, Haodong Jing, Nanning Zheng
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments reveal several interesting insights: 1) Training with cross-subject fMRI benefits both high-level and low-level decoding models; 2) Merging high-level and low-level information improves reconstruction performance at both levels; 3) Transfer learning is effective for new subjects with limited training data by training new adapters; 4) Decoders trained on visually-elicited brain activity can generalize to decode imagery-induced activity, though with reduced performance. |
| Researcher Affiliation | Academia | Yulong Liu 1,2,3, Yongqiang Ma 1,2,3, Guibo Zhu 4,5*, Haodong Jing 1,2,3, Nanning Zheng 1,2,3. 1 National Key Laboratory of Human-Machine Hybrid Augmented Intelligence; 2 National Engineering Research Center of Visual Information and Applications; 3 Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University; 4 Institute of Automation, Chinese Academy of Sciences; 5 Wuhan AI Research. EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods and algorithms in paragraph text and uses mathematical formulas, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/YulongBonjour/STTM |
| Open Datasets | Yes | Datasets: For cross-subject pre-training and evaluation we use the Natural Scenes Dataset (NSD) (Allen et al. 2022), which is currently the largest neural imaging dataset for data-driven brain decoding. Following common practice, our research uses data from the 4 subjects (1, 2, 5, 7) who completed all scan sessions. We used the NSDGeneral ROI mask at 1.8 mm resolution to derive ROIs for the 4 subjects, spanning visual areas from early to higher visual cortex. Corresponding captions were extracted from the COCO dataset. We then conduct transfer learning on the GOD dataset (Horikawa and Kamitani 2017), which has far fewer training samples per subject and is under a zero-shot setting. Following previous works (Chen et al. 2023; Zeng et al. 2024), we mainly use the data of subject 3 in GOD for comparison. We use preprocessed regions of interest (ROIs)¹ that cover voxels from early to higher visual areas. The GOD dataset provides both stimulus-evoked and imagery-induced fMRI data. Corresponding captions can be obtained from the GOD-Cap dataset (Liu et al. 2023). More details can be found in the Appendix. ¹ Preprocessed data and demo code available at http://brainliner.jp/data/brainliner/Generic Object Decoding |
| Dataset Splits | No | The paper mentions using a "test set" for evaluation (e.g., "982 brain samples in the test set") but does not describe explicit train/validation/test split sizes or the procedure used to construct them. |
| Hardware Specification | Yes | Our models are trained and tested on 8 Hygon DCUs with 16GB HBM2 memory. |
| Software Dependencies | No | The paper mentions optimizers like AdamW and a learning rate schedule, but does not specify software libraries (e.g., PyTorch, TensorFlow) or their version numbers. |
| Experiment Setup | Yes | Implementation Details: Our models are trained and tested on 8 Hygon DCUs with 16 GB HBM2 memory. Using data from 4 NSD subjects, we pre-train the high-level pipeline for 280 epochs and the low-level pipeline for 540 epochs, both with a global batch size of 192. For the high-level pipeline on the GOD dataset, we first train a new subject adapter for 4,500 epochs with a global batch size of 880, while keeping the pre-trained parts frozen. Afterward, we freeze the adapter and MLP backbone, and fine-tune the remaining parts for 400 epochs with a batch size of 600. Similarly, for the low-level pipeline, we train a new subject adapter for 5,000 epochs with a batch size of 192, then freeze the adapter and MLP backbone, and fine-tune the rest for 800 epochs. We optimize using AdamW (Loshchilov and Hutter 2019) with β1 = 0.9, β2 = 0.999, and ϵ = 10⁻⁸. We apply the OneCycle learning rate schedule (Smith and Topin 2019) with a maximum learning rate of 0.0005. For reconstruction evaluation metrics, we use MindEye's implementation. Further details are available in our code. |
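The reported setup (AdamW with β1 = 0.9, β2 = 0.999, ϵ = 10⁻⁸ and a OneCycle schedule peaking at 0.0005) can be sketched in pure Python. This is a minimal illustration of the cosine-annealed OneCycle shape, not the paper's code: the `pct_start`, `div_factor`, and `final_div_factor` values are assumed defaults (matching PyTorch's `OneCycleLR`), as the paper specifies only the maximum learning rate.

```python
import math

def one_cycle_lr(step, total_steps, max_lr=5e-4, pct_start=0.3,
                 div_factor=25.0, final_div_factor=1e4):
    """Cosine-annealed OneCycle learning rate (Smith and Topin 2019).

    Warms up from max_lr/div_factor to max_lr over the first pct_start
    fraction of training, then anneals down to a small final value.
    """
    initial_lr = max_lr / div_factor
    min_lr = initial_lr / final_div_factor
    warmup_steps = pct_start * total_steps
    if step < warmup_steps:
        t = step / warmup_steps  # 0 -> 1 over the warmup phase
        return initial_lr + (max_lr - initial_lr) * (1 - math.cos(math.pi * t)) / 2
    t = (step - warmup_steps) / (total_steps - warmup_steps)  # 0 -> 1 over annealing
    return max_lr + (min_lr - max_lr) * (1 - math.cos(math.pi * t)) / 2

# The LR starts at max_lr/25, peaks at max_lr ~30% into training, then decays.
```

In a PyTorch training loop, the equivalent would be `torch.optim.AdamW(params, lr=..., betas=(0.9, 0.999), eps=1e-8)` paired with `torch.optim.lr_scheduler.OneCycleLR(opt, max_lr=5e-4, total_steps=...)`.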