Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

DCA: Graph-Guided Deep Embedding Clustering for Brain Atlases

Authors: Mo Wang, Kaining Peng, Jingsheng Tang, Hongkai Wen, Quanying Liu

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present Deep Cluster Atlas (DCA), a graph-guided deep embedding clustering framework for generating individualized, voxel-wise brain parcellations. DCA combines a pretrained voxel-level fMRI autoencoder with spatially regularized deep clustering to produce functionally coherent and spatially contiguous regions. Our method supports flexible control over resolution and anatomical scope, and generalizes to arbitrary brain structures. We further introduce a standardized benchmarking platform for atlas evaluation, using multiple large-scale fMRI datasets. Across multiple datasets and scales, DCA outperforms state-of-the-art atlases, improving functional homogeneity by 98.8% and silhouette coefficient by 29%, and achieves superior performance in downstream tasks.
Researcher Affiliation	Academia	Mo Wang, SUSTech & University of Warwick; Kaining Peng, SUSTech; Jingsheng Tang, SUSTech; Hongkai Wen, University of Warwick, EMAIL; Quanying Liu, SUSTech, EMAIL
Pseudocode	Yes	Algorithm 1 Group-level atlas generation from individual parcellations
Open Source Code	Yes	Codes are available at https://github.com/ncclab-sustech/DCA.
Open Datasets	Yes	We use resting-state fMRI data from 1000 subjects in the Human Connectome Project (HCP) [29] for atlas construction. All data were processed using the HCP minimal preprocessing pipeline [30]... For downstream evaluation, we use three public datasets: HCP, ABIDE [32], and ADNI [33].
Dataset Splits	Yes	Both evaluations used a 70 / 10 / 20 subject split for train/validation/test, fully disjoint from fine-tuning subjects to avoid leakage.
Hardware Specification	Yes	The model is trained for 8 epochs on 2 NVIDIA A100 GPUs using a batch size of 4.
Software Dependencies	No	The optimizer and learning rate are Adam and 0.01 respectively for both pretraining and fine-tuning.
Experiment Setup	Yes	We pretrain our model using a masked reconstruction objective on fMRI data blocks of size 96 × 96 × 96 × 300, representing 3D spatial volumes with 300 temporal frames. The model is trained for 8 epochs on 2 NVIDIA A100 GPUs using a batch size of 4. The optimizer and learning rate are Adam and 0.01 respectively for both pretraining and fine-tuning. During pretraining, we adopt a masking ratio of 0.8, randomly masking 80% of the input in both spatial and temporal dimensions.
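
The Dataset Splits row above reports a 70 / 10 / 20 subject-level split for train/validation/test that is disjoint from the fine-tuning subjects. The following is a minimal sketch of how such a split could be produced, not the authors' implementation. The subject ID format, the number of fine-tuning subjects, the random seed, and the helper name `split_subjects` are all assumptions made for illustration.

```python
import random


def split_subjects(subject_ids, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle subject IDs and partition them into disjoint train/val/test lists.

    Hypothetical helper: the fractions follow the reported 70/10/20 split,
    while the seed and the rounding rule are assumptions.
    """
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle for a repeatable split
    n_train = int(round(fractions[0] * len(ids)))
    n_val = int(round(fractions[1] * len(ids)))
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Assumed setup: 1000 subjects, of which the first 100 are treated as
# fine-tuning subjects (both numbers are placeholders for this sketch).
all_subjects = [f"sub-{i:04d}" for i in range(1000)]
finetune_subjects = set(all_subjects[:100])

# Remove fine-tuning subjects before splitting so none can leak into evaluation.
eval_pool = [s for s in all_subjects if s not in finetune_subjects]
train, val, test = split_subjects(eval_pool)

print(len(train), len(val), len(test))
```

Splitting by subject rather than by scan or voxel block keeps all data from one person in a single partition, which is the property the quoted sentence relies on to avoid leakage.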