Neural encoding with visual attention
Authors: Meenakshi Khosla, Gia Ngo, Keith Jamison, Amy Kuceyeski, Mert Sabuncu
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using concurrent eye-tracking and functional Magnetic Resonance Imaging (fMRI) recordings from a large cohort of human subjects watching movies, we first demonstrate that leveraging gaze information, in the form of attentional masking, can significantly improve brain response prediction accuracy in a neural encoding model. Next, we propose a novel approach to neural encoding by including a trainable soft-attention module. Using our new approach, we demonstrate that it is possible to learn visual attention policies by end-to-end learning merely on fMRI response data, and without relying on any eye-tracking. Interestingly, we find that attention locations estimated by the model on independent data agree well with the corresponding eye fixation patterns, despite no explicit supervision to do so. |
| Researcher Affiliation | Academia | School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853; Nancy E. and Peter C. Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY 14853; Radiology, Weill Cornell Medicine, New York, NY 10065; Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY 10065 |
| Pseudocode | No | The paper describes the architecture and operations using equations and descriptive text (e.g., the softmax normalization $A^{(i)} = \frac{\exp S^{(i)}}{\sum_{j=1}^{n} \exp S^{(j)}},\; i \in \{1, \dots, n\}$), but it does not present any structured pseudocode or algorithm blocks. A hedged code sketch of this soft-attention computation appears after the table. |
| Open Source Code | Yes | Our code is available at https://github.com/mk2299/encoding_attention. |
| Open Datasets | Yes | We study high-resolution 7T fMRI (TR = 1 s, voxel size = 1.6 mm isotropic) recordings of 158 participants from the Human Connectome Project (HCP) movie-watching database while they viewed 4 audio-visual movies in separate runs [13, 26]. |
| Dataset Splits | Yes | We train and validate our models on three movies using a 9:1 train-val split and leave the fourth movie for independent testing. This yields 2000 training, 265 validation and 699 test stimulus-response pairs. A sketch of this split protocol appears after the table. |
| Hardware Specification | No | The paper mentions "7T fMRI" and computational aspects such as "computational/memory constraints", but it does not specify the GPU or CPU models, or any other hardware used to run the experiments. |
| Software Dependencies | No | The paper mentions using Adam optimizer and ResNet-50 architecture, but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | All parameters were optimized to minimize the mean squared error between the predicted and target fMRI response using Adam [18] for 25 epochs with a learning rate of 1e-4. Validation curves were monitored to ensure convergence and hyperparameters were optimized on the validation set. A training-loop sketch follows the table. |
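
The softmax expression quoted in the Pseudocode row normalizes per-location saliency scores $S^{(i)}$ into an attention map $A^{(i)}$ that reweights the visual features. As a hedged illustration only (the paper does not name its framework; PyTorch, the class name `SoftAttention`, and the 1×1-convolution scoring head are assumptions here), a minimal spatial soft-attention layer could look like:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttention(nn.Module):
    """Hypothetical spatial soft-attention over CNN feature maps.

    Computes a scalar saliency score S^(i) per spatial location i,
    normalizes with a softmax, A^(i) = exp(S^(i)) / sum_j exp(S^(j)),
    and reweights the features with the resulting attention map.
    """
    def __init__(self, in_channels: int):
        super().__init__()
        # 1x1 conv maps each location's feature vector to a scalar score
        self.score = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, channels, height, width)
        b, c, h, w = feats.shape
        s = self.score(feats).view(b, -1)          # scores S^(i), shape (b, h*w)
        a = F.softmax(s, dim=1).view(b, 1, h, w)   # attention map A^(i)
        return feats * a                           # attention-weighted features
```

In the paper's setup such a module is trained end-to-end against fMRI responses; this sketch shows only the scoring, normalization, and reweighting step.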
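For the Dataset Splits row, the stated protocol is to train and validate on three movies with a 9:1 split and hold out the fourth movie for testing. A minimal sketch under that assumption (the function name `split_hcp_movies` and the per-movie list layout are hypothetical, not from the paper):

```python
import random

def split_hcp_movies(movies, val_frac=0.1, seed=0):
    """Train/val on the first three movies (9:1 split); fourth movie held out."""
    trainval = [pair for movie in movies[:3] for pair in movie]
    random.Random(seed).shuffle(trainval)
    n_val = int(len(trainval) * val_frac)
    return trainval[n_val:], trainval[:n_val], movies[3]

# train, val, test = split_hcp_movies(movies)  # paper reports 2000/265/699 pairs
```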
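The Experiment Setup row fully specifies the optimization: Adam, learning rate 1e-4, mean squared error, 25 epochs, with validation curves monitored. A minimal training-loop sketch under the assumption of PyTorch (`model`, `train_loader`, and `val_loader` are placeholders, not from the paper):

```python
import torch

# Hedged sketch of the stated setup: Adam, lr 1e-4, MSE loss, 25 epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = torch.nn.MSELoss()

for epoch in range(25):
    model.train()
    for stimulus, response in train_loader:        # movie frames -> fMRI targets
        optimizer.zero_grad()
        loss = loss_fn(model(stimulus), response)  # predicted vs. measured response
        loss.backward()
        optimizer.step()

    # Monitor validation loss each epoch to check convergence,
    # mirroring the paper's use of validation curves for tuning.
    model.eval()
    with torch.no_grad():
        val_loss = sum(loss_fn(model(x), y).item()
                       for x, y in val_loader) / len(val_loader)
    print(f"epoch {epoch}: val MSE = {val_loss:.4f}")
```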