Self-Supervised Learning of Brain Dynamics from Broad Neuroimaging Data
Authors: Armin Thomas, Christopher Ré, Russell Poldrack
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the frameworks by pre-training models on a broad neuroimaging dataset spanning functional Magnetic Resonance Imaging data from 11,980 experimental runs of 1,726 individuals across 34 datasets, and subsequently adapting the pre-trained models to benchmark mental state decoding datasets. The pre-trained models transfer well, generally outperforming baseline models trained from scratch, while models trained in a learning framework based on causal language modeling clearly outperform the others. [An illustrative CSM sketch follows the table.] |
| Researcher Affiliation | Academia | Armin W. Thomas, Department of Psychology, Stanford University (athms@stanford.edu); Christopher Ré, Department of Computer Science, Stanford University (chrismre@stanford.edu); Russell A. Poldrack, Department of Psychology, Stanford University (poldrack@stanford.edu) |
| Pseudocode | No | The paper describes algorithms verbally and with diagrams (Figure 1) but does not include formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | To enable others to build on this work, we make our code, training data, and pre-trained models publicly available at github.com/athms/learning-from-brains. |
| Open Datasets | Yes | All unprocessed fMRI data used in this study are publicly available through OpenNeuro.org [3] and the Human Connectome Project (HCP [33]). |
| Dataset Splits | Yes | We split the upstream data into distinct training and evaluation datasets by randomly designating 5% of the fMRI runs of each included fMRI dataset as evaluation data (at a minimum of 2 runs per dataset) and using the rest of the runs for training. At each evaluation step, we randomly sample 640,000 sequences from the evaluation dataset. ... Specifically, for each evaluated dataset size, we first randomly selected 10 (HCP) and 3 (MDTB) individuals whose data we use for validation, and 20 (HCP) and 9 (MDTB) other individuals whose data we use for testing, before randomly sampling the given number of individuals whose data we use for training (1 to 48 (HCP) and 11 (MDTB)) from the remaining pool of individuals. [A split sketch follows the table.] |
| Hardware Specification | Yes | Upstream training was performed on Google Compute Engine n1-highmem64 nodes with four Nvidia Tesla P100 GPUs, 64 CPU threads, and 416 GB RAM memory, while downstream adaptations were performed on compute nodes of the Texas Advanced Computing Center with one Nvidia 1080-TI GPU, 32 CPU threads, and 128 GB RAM memory. |
| Software Dependencies | No | The paper mentions software such as the ADAM optimizer and Hugging Face's Transformers library, but does not provide version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | We train all models with stochastic gradient descent and the ADAM optimizer (with β1 = 0.9, β2 = 0.999, and ε = 1e-8) [32], if not reported otherwise. We also apply a linear learning rate decay schedule (with a warmup phase of 1% of the total number of training steps), gradient norm clipping at 1.0, and L2-regularisation (weighted by 0.1). ... During upstream learning, we set the maximum learning rates to 2e-4, 5e-4, 1e-4, and 1e-4 for the autoencoding, CSM, Sequence-BERT, and Network-BERT frameworks, respectively... and randomly sample sequences X of 10 to 55 TRs from our upstream fMRI runs. ... We trained one model for each point of the resulting 9-point grid of each framework for 200,000 training steps, using a mini-batch size of 256 sequences. [A training-setup sketch follows the table.] |
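
To make the causal language modeling (CSM) framework referenced in the Research Type row more concrete, here is a minimal sketch of causal next-TR prediction over an fMRI time series. The architecture, dimensions, and names (`CausalSequenceModel`, `n_networks`, `d_model`) are illustrative assumptions, not the paper's implementation.

```python
import torch
from torch import nn

class CausalSequenceModel(nn.Module):
    """Illustrative causal sequence model over fMRI time series: each TR is a
    vector of network activations, and the model predicts the next TR from all
    preceding ones (the CSM-style objective referenced above). Dimensions and
    architecture are placeholders, not the paper's."""
    def __init__(self, n_networks=1024, d_model=768, n_layers=4, n_heads=12):
        super().__init__()
        self.embed = nn.Linear(n_networks, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_networks)

    def forward(self, x):  # x: (batch, TRs, n_networks)
        t = x.size(1)
        # mask future positions so each TR only attends to preceding TRs
        causal_mask = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
        h = self.encoder(self.embed(x), mask=causal_mask)
        return self.head(h)

model = CausalSequenceModel()
x = torch.randn(2, 20, 1024)                            # two sequences of 20 TRs
pred = model(x)
loss = nn.functional.mse_loss(pred[:, :-1], x[:, 1:])   # predict the next TR
```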
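
The Dataset Splits row describes designating roughly 5% of each upstream dataset's fMRI runs (at least 2) as evaluation data. Below is a minimal sketch of that per-dataset split, assuming runs are already grouped by dataset; the helper name `split_runs_per_dataset` and the run identifiers are hypothetical.

```python
import random

def split_runs_per_dataset(runs_by_dataset, eval_frac=0.05, min_eval=2, seed=0):
    """Sketch of the upstream split: randomly designate ~5% of each dataset's
    fMRI runs (at least 2 per dataset) as evaluation data, keep the rest for
    training."""
    rng = random.Random(seed)
    train, evaluation = {}, {}
    for name, runs in runs_by_dataset.items():
        runs = list(runs)
        rng.shuffle(runs)
        n_eval = max(min_eval, round(eval_frac * len(runs)))
        evaluation[name] = runs[:n_eval]
        train[name] = runs[n_eval:]
    return train, evaluation

# Example with made-up run identifiers:
train, evaluation = split_runs_per_dataset(
    {"ds000001": [f"run-{i:03d}" for i in range(40)]}
)
```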
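
The Experiment Setup row spells out the optimisation recipe (Adam with β1 = 0.9, β2 = 0.999, ε = 1e-8, L2 weight 0.1, linear decay with a 1% warmup phase, gradient norm clipping at 1.0, 200,000 steps, mini-batches of 256 sequences). Here is a minimal PyTorch sketch of that recipe using the linear-warmup scheduler from Hugging Face Transformers, which the paper's stack mentions; the stand-in model, dummy loss, and the CSM learning rate of 5e-4 are placeholders for the actual training code.

```python
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(1024, 1024)  # stand-in for the actual architecture
total_steps = 200_000                # paper: 200,000 upstream training steps

# Adam with the quoted betas/epsilon; weight_decay=0.1 mirrors the
# L2-regularisation weight reported in the paper.
optimizer = torch.optim.Adam(
    model.parameters(), lr=5e-4, betas=(0.9, 0.999), eps=1e-8, weight_decay=0.1
)
# Linear decay with a warmup phase of 1% of the total number of steps.
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.01 * total_steps),
    num_training_steps=total_steps,
)

for step in range(100):  # illustrative; the paper trains for the full 200,000 steps
    batch = torch.randn(256, 1024)        # stand-in for a mini-batch of 256 sequences
    loss = model(batch).pow(2).mean()     # dummy loss in place of the framework objective
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # clip grad norm at 1.0
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```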