reproducibilityindex.ai

Geodesic Optimization for Predictive Shift Adaptation on EEG data

Authors: Apolline Mellot, Antoine Collas, Sylvain Chevallier, Alex Gramfort, Denis A. Engemann

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We performed empirical benchmarks on the cross-site generalization of age-prediction models with resting-state EEG data from a large multi-national dataset (HarMNqEEG), which included 14 recording sites and more than 1500 human participants. Compared to state-of-the-art methods, our results showed that GOPSA achieved significantly higher performance on three regression metrics (R², MAE, and Spearman's ρ) for several source-target site combinations, highlighting its effectiveness in tackling multi-source DA with predictive shifts in EEG data analysis.
Researcher Affiliation	Collaboration	Apolline Mellot, Antoine Collas: Inria, CEA, Université Paris-Saclay, Palaiseau, France (apolline.mellot@inria.fr, antoine.collas@inria.fr). Sylvain Chevallier: TAU Inria, LISN-CNRS, University Paris-Saclay, France (sylvain.chevallier@universite-paris-saclay.fr). Alexandre Gramfort: Inria, CEA, Université Paris-Saclay, Palaiseau, France (alexandre.gramfort@inria.fr). Denis A. Engemann: Roche Pharma Research and Early Development, Neuroscience and Rare Diseases, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland (denis.engemann@roche.com).
Pseudocode	Yes	Algorithm 1: Train-Time GOPSA; Algorithm 2: Test-Time GOPSA
Open Source Code	Yes	The dataset HarMNqEEG [33] is in open access. We provide the code to reproduce the experiments from the raw data.
Open Datasets	Yes	The HarMNqEEG dataset [33] was used for our numerical experiments. This dataset includes EEG recordings collected from 1564 participants across 14 different study sites, distributed across 9 countries. In our analysis, we consider each study site as a distinct domain.
Dataset Splits	Yes	For each source-target combination we performed a stratified shuffle split approach with 100 repetitions on the target data. Stratification was based on the recording sites to ensure that each split contained a balanced proportion of participants from each site. The regularization parameter λ in Ridge regression was selected with a nested cross-validation (grid search) over a logarithmic grid of values from 10^-1 to 10^5. To evaluate the benefit of GOPSA, we compared it against four baselines.
Hardware Specification	Yes	Experiments with 100 repetitions and all site combinations have been run on a standard Slurm cluster for 12 hours with 250 CPU cores.
Software Dependencies	Yes	Numerical computation was enabled by the scientific Python ecosystem: Matplotlib [27], Scikit-learn [42], Numpy [21], Scipy [54], PyTorch [41], PyRiemann [3], MNE [19] and SKADA [18]. Specifically, PyRiemann [3] is cited as "v0.3, July 2022" and SKADA [18] as "7 2024".
Experiment Setup	Yes	The regularization parameter λ in Ridge regression was selected with a nested cross-validation (grid search) over a logarithmic grid of values from 10^-1 to 10^5. In practice, we use L-BFGS and obtain the gradient using automatic differentiation through the Ridge solution that is plugged into the loss in (8).
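
The split and model-selection protocol quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the synthetic data, the number of sites, the 50% test fraction, the 5-fold inner cross-validation, the 7-point grid, and the use of 10 repetitions instead of 100 are all assumptions chosen to keep the example small, and the sketch pools all data rather than separating source and target sites.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 3 sites, 40 participants each, 10 features.
n_sites, n_per_site, n_features = 3, 40, 10
sites = np.repeat(np.arange(n_sites), n_per_site)
X = rng.normal(size=(n_sites * n_per_site, n_features))
w = rng.normal(size=n_features)
y = X @ w + 0.1 * rng.normal(size=X.shape[0])

# Logarithmic grid for the Ridge penalty from 10^-1 to 10^5
# (the number of grid points, 7, is an assumption).
alphas = np.logspace(-1, 5, 7)

# Outer loop: shuffle splits stratified on the site labels, so every split
# keeps the same proportion of participants from each site.
# 10 repetitions here for speed; the quoted protocol uses 100.
outer = StratifiedShuffleSplit(n_splits=10, test_size=0.5, random_state=0)

scores = []
for train_idx, test_idx in outer.split(X, sites):
    # Inner loop: grid search over the penalty using only the training part.
    search = GridSearchCV(Ridge(), {"alpha": alphas}, cv=5, scoring="r2")
    search.fit(X[train_idx], y[train_idx])
    # Score the refit best model on the held-out part (R^2).
    scores.append(search.score(X[test_idx], y[test_idx]))

mean_r2 = float(np.mean(scores))
```

Passing the site labels as the second argument of `outer.split` is what makes the stratification site-based rather than outcome-based; the regression target `y` is only used inside the loop.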