Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

BrainEC-LLM: Brain Effective Connectivity Estimation by Multiscale Mixing LLM

Authors: Wen Xiong, Junzhong Ji, Jinduo Liu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The experimental results on simulated and real resting-state fMRI datasets demonstrate that BrainEC-LLM can achieve superior performance when compared to state-of-the-art baselines. The code is available at https://github.com/XiongWenXww/BrainEC-LLM.
Researcher Affiliation	Academia	Wen Xiong, Junzhong Ji, Jinduo Liu Beijing University of Technology EMAIL, EMAIL
Pseudocode	Yes	The algorithm description and pseudocode can be discovered in Appendix C.
Open Source Code	Yes	The code is available at https://github.com/XiongWenXww/BrainEC-LLM.
Open Datasets	Yes	Simulated fMRI Dataset. The benchmark simulated datasets we use are Smith dataset [58] and Sanchez dataset [54] both generated by dynamic causal model. Specifically, Sanchez dataset offers higher temporal resolution and acquisition frequency compared to Smith dataset and minimally affects the non-Gaussianity of the BOLD signal, a configuration commonly observed in real brain networks [45]. Furthermore, we generate a simulated dataset using CDRL [79]. ... Real Resting-state fMRI Dataset. To assess the performance of methods under real BOLD data conditions, we utilize high-resolution 7T human resting-state fMRI data [56] from medial temporal lobe.
Dataset Splits	No	Given the limited availability of fMRI data, segmentation would further reduce the dataset size and potentially compromise the reliability of results. Moreover, since EC lacks ground truth on real fMRI datasets, we opt not to segment the dataset but instead directly train model on the available fMRI data in an autoregressive manner, obtaining both the reconstructed fMRI data and brain EC. This approach leverages the inherent autocorrelation (i.e., the brain effective connectivity network) within the fMRI time series to predict the fMRI data itself, enabling the model to learn more representative and informative connectivity structures. As a result, the model is trained by minimizing the difference between the predicted and actual fMRI signals. The EC network is generated as the outcome of this optimization process, and evaluation metrics are directly computed based on the predicted EC network without the need for a separate test set.
Hardware Specification	Yes	All experiments are conducted on a single Nvidia L20-48GB GPU.
Software Dependencies	No	We use Llama3-8B [16] as the default backbone model unless otherwise specified. ... Additionally, the results using Llama 3 as the backbone are manifestly outperforms Mistral.
Experiment Setup	Yes	We train the model unsupervisedly in an autoregressive manner. Detailed training methods and hyperparameter settings can be found in Appendix D.6 and Appendix D.7, respectively.
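
Illustrative note on the Dataset Splits row: the quoted passage describes fitting a model to the whole time series in an autoregressive manner and reading the connectivity network off the fitted model, with no train/test split. The sketch below shows that general idea only. It swaps in a first-order linear autoregressive model fitted by least squares, which is an assumption made for illustration and not the LLM-based method of the paper. The function names, the toy 3-node coupling matrix, the noise scale, and the 0.2 edge threshold are all hypothetical.

```python
import numpy as np


def simulate_var(coupling, n_steps, noise_scale=0.1, seed=0):
    """Generate a toy series with x[t+1] = x[t] @ coupling + noise.

    coupling[i, j] is the influence of node i on node j (hypothetical convention).
    """
    rng = np.random.default_rng(seed)
    n_nodes = coupling.shape[0]
    series = np.zeros((n_steps, n_nodes))
    series[0] = rng.normal(size=n_nodes)
    for t in range(n_steps - 1):
        noise = noise_scale * rng.normal(size=n_nodes)
        series[t + 1] = series[t] @ coupling + noise
    return series


def fit_autoregressive(series):
    """Least-squares fit of x[t+1] ~ x[t] @ weights on the full series (no split)."""
    past, future = series[:-1], series[1:]
    weights, *_ = np.linalg.lstsq(past, future, rcond=None)
    reconstruction = past @ weights  # one-step-ahead prediction of the signal
    return weights, reconstruction


# Hypothetical ground truth: node 0 drives node 1, node 1 drives node 2.
true_coupling = np.array([
    [0.5, 0.4, 0.0],
    [0.0, 0.5, 0.4],
    [0.0, 0.0, 0.5],
])

series = simulate_var(true_coupling, n_steps=2000)
estimated, reconstruction = fit_autoregressive(series)

# Keep off-diagonal weights above an assumed threshold as directed edges.
adjacency = (np.abs(estimated) > 0.2) & ~np.eye(3, dtype=bool)
true_edges = (true_coupling != 0) & ~np.eye(3, dtype=bool)

# On simulated data the estimated edges can be scored against the known graph.
n_correct = int((adjacency == true_edges).sum())
```

In this toy setting the connectivity estimate is a by-product of minimizing one-step prediction error, and it is scored against the known graph rather than against held-out data, which mirrors the reasoning quoted in the row.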