Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Disentangling syntax and semantics in the brain with deep networks
Authors: Charlotte Caucheteux, Alexandre Gramfort, Jean-Remi King
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Functional MRI dataset. We analyze the Narratives public dataset (Nastase et al., 2020), which contains the fMRI measurements of 345 unique subjects listening to narratives. |
| Researcher Affiliation | Collaboration | ¹Inria, Saclay, France; ²Facebook AI Research, Paris, France; ³École normale supérieure, PSL University, CNRS, Paris, France. |
| Pseudocode | No | The paper describes methods through text and figures (e.g., Figure 2 for method to isolate syntactic representations) but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to a source code repository or an explicit statement about the release of their own source code for the methodology described. |
| Open Datasets | Yes | Functional MRI dataset. We analyze the Narratives public dataset (Nastase et al., 2020), which contains the fMRI measurements of 345 unique subjects listening to narratives. |
| Dataset Splits | Yes | f ∘ g was fitted on I_train = 99% of the dataset, and evaluated on I_test = 1% of the left-out data (2.5 min of audio). ... We repeat the procedure 100 times with a 100-fold cross-validation, using scikit-learn KFold without shuffling (Pedregosa et al., 2011). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions several software packages used, such as 'spaCy', 'Supar', 'Gector', 'scikit-learn', and 'MNE-Python', but it does not specify their version numbers, which is required for reproducibility. |
| Experiment Setup | Yes | We use the linear ridge regression from scikit-learn (Pedregosa et al., 2011), with penalization parameters chosen among 10 values log-spaced between 10⁻¹ and 10⁸ and g was a finite impulse response (FIR) model with 5 delays, following (Huth et al., 2016). ... To isolate the syntactic representations of GPT-2, we synthesize, for each sentence of each story, k = 10 sentences with the same syntactic structures (Figure 2). |
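
The split and setup rows above can be illustrated with a minimal sketch, which is not the authors' code. It assumes synthetic data in place of the model activations and fMRI signals, reads the garbled lower bound of the penalty grid as 10⁻¹, uses delays of 0 to 4 samples for the FIR step (the source only states "5 delays"), and scores each fold with a Pearson correlation as an illustrative metric. The helper `add_fir_delays` is hypothetical.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

# Synthetic stand-ins (assumption): X mimics model activations,
# y mimics the fMRI time series of a single voxel.
rng = np.random.default_rng(0)
n_samples, n_features = 1000, 8
X = rng.standard_normal((n_samples, n_features))
y = X @ rng.standard_normal(n_features) + 0.5 * rng.standard_normal(n_samples)


def add_fir_delays(X, n_delays=5):
    """Hypothetical helper: stack copies of X delayed by 0..n_delays-1
    samples, zero-padded at the start of the series."""
    delayed = []
    for d in range(n_delays):
        shifted = np.zeros_like(X)
        shifted[d:] = X[: len(X) - d]
        delayed.append(shifted)
    return np.hstack(delayed)


X_fir = add_fir_delays(X, n_delays=5)

# 10 penalties log-spaced between 1e-1 (assumed reading) and 1e8.
alphas = np.logspace(-1, 8, 10)

# 100 contiguous folds without shuffling: each test fold is 1% of the data.
cv = KFold(n_splits=100, shuffle=False)

scores = []
for train_idx, test_idx in cv.split(X_fir):
    model = RidgeCV(alphas=alphas)  # selects the penalty on the training fold
    model.fit(X_fir[train_idx], y[train_idx])
    pred = model.predict(X_fir[test_idx])
    scores.append(np.corrcoef(pred, y[test_idx])[0, 1])

mean_score = float(np.mean(scores))
```

Leaving `shuffle=False` keeps every test fold a contiguous block of time, which matters for time series because shuffled folds would place temporally adjacent (and therefore correlated) samples in both the training and test sets.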