Alleviating the Semantic Gap for Generalized fMRI-to-Image Reconstruction
Authors: Tao Fang, Qian Zheng, Gang Pan
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that the proposed GESS model outperforms state-of-the-art methods, and we propose a generalized scenario split strategy to evaluate the advantage of GESS in closing the semantic gap. |
| Researcher Affiliation | Academia | 1College of Computer Science and Technology, Zhejiang University, Hangzhou, China 2The State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou, China |
| Pseudocode | Yes | Algorithm 1 The pseudo code of GESS. |
| Open Source Code | Yes | Our codes are available at https://github.com/duolala1/GESS. |
| Open Datasets | Yes | We evaluated the performance of our model using two datasets: the General Object Decoding (GOD) dataset [19] and the Natural Scenes Dataset (NSD) [1]. The GOD dataset contains 1200 images from 150 categories for training and 50 images from 50 categories for testing. NSD uses images from the COCO dataset and roughly 10,000 fMRI-image pairs for one subject. |
| Dataset Splits | Yes | The GOD dataset contains 1200 images from 150 categories for training and 50 images from 50 categories for testing...We used 1200 and 50 samples as the training and testing samples, respectively. |
| Hardware Specification | No | The paper mentions 'computational cost' in the limitations, implying hardware use, but it does not specify any exact GPU models (e.g., NVIDIA A100), CPU types, or other specific hardware configurations used for running experiments. |
| Software Dependencies | No | The paper mentions several software components like 'CLIP model', 'VQGAN', 'LDM', 'DDIM', and 'CycleGAN', but it does not provide specific version numbers for any of these software dependencies or libraries. |
| Experiment Setup | Yes | We use ridge regression trained with λc = 1000. We slightly smooth the images with a Gaussian kernel (r = 5) before extracting semantic features...images are pre-smoothed with a Gaussian kernel (r = 15)...We use DDIM [26] acceleration during sampling with 50 time steps. |
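The ridge-regression step quoted in the Experiment Setup row can be sketched in closed form. This is a minimal illustration, not the authors' implementation: the voxel and feature dimensions are arbitrary placeholders, random arrays stand in for fMRI responses and semantic features (the paper extracts features with CLIP), and only the regularization strength λc = 1000 and the GOD split sizes (1200 train / 50 test) come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (assumed shapes): GOD provides 1200 training samples.
n_train, n_voxels, n_feat = 1200, 300, 64
X = rng.standard_normal((n_train, n_voxels))  # fMRI voxel responses (placeholder)
Y = rng.standard_normal((n_train, n_feat))    # semantic features (CLIP in the paper; random here)

# Closed-form ridge regression with lambda_c = 1000 as reported:
#   W = (X^T X + lambda * I)^{-1} X^T Y
lam = 1000.0
W = np.linalg.solve(X.T @ X + lam * np.eye(n_voxels), X.T @ Y)

# Decode features for a 50-sample "test" set, matching the GOD test size.
pred = X[:50] @ W
print(pred.shape)  # (50, 64)
```

In practice the paper fits this mapping per subject and then feeds the decoded features to the generative model; the closed-form solve above is equivalent to `sklearn.linear_model.Ridge(alpha=1000)` without an intercept.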