Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Geodesic Optimization for Predictive Shift Adaptation on EEG data
Authors: Apolline Mellot, Antoine Collas, Sylvain Chevallier, Alex Gramfort, Denis A. Engemann
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed empirical benchmarks on the cross-site generalization of age-prediction models with resting-state EEG data from a large multi-national dataset (HarMNqEEG), which included 14 recording sites and more than 1500 human participants. Compared to state-of-the-art methods, our results showed that GOPSA achieved significantly higher performance on three regression metrics (R², MAE, and Spearman's ρ) for several source-target site combinations, highlighting its effectiveness in tackling multi-source DA with predictive shifts in EEG data analysis. |
| Researcher Affiliation | Collaboration | Apolline Mellot, Antoine Collas: Inria, CEA, Université Paris-Saclay, Palaiseau, France (EMAIL, EMAIL). Sylvain Chevallier: TAU Inria, LISN-CNRS, University Paris-Saclay, France (sylvain.chevallier@universite-paris-saclay.fr). Alexandre Gramfort: Inria, CEA, Université Paris-Saclay, Palaiseau, France (EMAIL). Denis A. Engemann: Roche Pharma Research and Early Development, Neuroscience and Rare Diseases, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland (EMAIL). |
| Pseudocode | Yes | Algorithm 1: Train-Time GOPSA; Algorithm 2: Test-Time GOPSA |
| Open Source Code | Yes | The dataset HarMNqEEG [33] is in open access. We provide the code to reproduce the experiments from the raw data. |
| Open Datasets | Yes | The HarMNqEEG dataset [33] was used for our numerical experiments. This dataset includes EEG recordings collected from 1564 participants across 14 different study sites, distributed across 9 countries. In our analysis, we consider each study site as a distinct domain. |
| Dataset Splits | Yes | For each source-target combination we performed a stratified shuffle split approach with 100 repetitions on the target data. Stratification was based on the recording sites to ensure that each split contained a balanced proportion of participants from each site. The regularization parameter λ in Ridge regression was selected with a nested cross-validation (grid search) over a logarithmic grid of values from 10⁻¹ to 10⁵. To evaluate the benefit of GOPSA, we compared it against four baselines. |
| Hardware Specification | Yes | Experiments with 100 repetitions and all site combinations have been run on a standard Slurm cluster for 12 hours with 250 CPU cores. |
| Software Dependencies | Yes | Numerical computation was enabled by the scientific Python ecosystem: Matplotlib [27], Scikit-learn [42], NumPy [21], SciPy [54], PyTorch [41], PyRiemann [3], MNE [19] and SKADA [18]. Specifically, PyRiemann [3] is cited as "v0.3, July 2022" and SKADA [18] as "7 2024". |
| Experiment Setup | Yes | The regularization parameter λ in Ridge regression was selected with a nested cross-validation (grid search) over a logarithmic grid of values from 10⁻¹ to 10⁵. In practice, we use L-BFGS and obtain the gradient using automatic differentiation through the Ridge solution that is plugged into the loss in (8). |
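
The evaluation protocol quoted under Dataset Splits (a site-stratified shuffle split with 100 repetitions, plus an inner search for the Ridge parameter λ over a logarithmic grid) can be illustrated with a minimal sketch. This is not the authors' code and does not implement GOPSA. The synthetic data, the number of sites, the 50% test fraction, the grid resolution, and the use of scikit-learn's `RidgeCV` for the inner search are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import StratifiedShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 200 "participants", 10 features, 4 "sites".
n_samples, n_features, n_sites = 200, 10, 4
X = rng.standard_normal((n_samples, n_features))
true_coef = rng.standard_normal(n_features)
y = X @ true_coef + 0.1 * rng.standard_normal(n_samples)
sites = np.repeat(np.arange(n_sites), n_samples // n_sites)

# Logarithmic grid for the Ridge regularization parameter
# (bounds and number of grid points are assumptions).
alphas = np.logspace(-1, 5, 7)

# Stratify on the site labels so every split keeps the site proportions.
# The 50% test fraction is an assumption, not taken from the paper.
splitter = StratifiedShuffleSplit(n_splits=100, test_size=0.5, random_state=0)

scores = []
for train_idx, test_idx in splitter.split(X, sites):
    # RidgeCV selects alpha by an inner cross-validation on the training part only.
    model = RidgeCV(alphas=alphas)
    model.fit(X[train_idx], y[train_idx])
    scores.append(model.score(X[test_idx], y[test_idx]))  # R^2 on held-out part

scores = np.asarray(scores)
print(f"mean R2 over {len(scores)} splits: {scores.mean():.3f}")
```

Passing the site labels as the stratification target, rather than the regression target, is what keeps each repetition balanced across sites.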