Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Causal effects of intervening variables in settings with unmeasured confounding

Authors: Lan Wen, Aaron Sarvet, Mats Stensrud

JMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The theoretical results are applied to data from the National Health and Nutrition Examination Survey. Keywords: Causal inference; Double robustness; Estimands; Frontdoor formula; Intervening variable; Separable effect
Researcher Affiliation	Academia	Lan Wen EMAIL Department of Statistics and Actuarial Science University of Waterloo Waterloo, ON N2L 3G1, Canada Aaron L. Sarvet EMAIL Department of Biostatistics and Epidemiology University of Massachusetts Amherst Amherst, MA 01003, United States Mats J. Stensrud EMAIL Department of Mathematics École Polytechnique Fédérale de Lausanne Lausanne, 1015, Switzerland
Pseudocode	Yes	Algorithm 1 Algorithm for Weighted ICE (generalized frontdoor formula) ... Algorithm 2 Algorithm for Targeted maximum likelihood (generalized frontdoor formula) ... Algorithm 3 Algorithm for Weighted ICE (frontdoor formula) ... Algorithm 4 Algorithm for iterative Targeted maximum likelihood (generalized frontdoor formula) ... Algorithm 5 Algorithm for Weighted ICE (generalized frontdoor formula for discrete exposure with more than two levels)
Open Source Code	No	No explicit statement or link to open-source code for the methodology described in the paper is provided.
Open Datasets	Yes	The theoretical results are applied to data from the National Health and Nutrition Examination Survey. ... We used the dataset from Inoue et al. (2022), which includes observations from the NHANES study linked to a national mortality database (National Death Index).
Dataset Splits	No	The paper mentions using data from the NHANES study but does not specify any training/test/validation splits for experiments. The simulation study describes data-generating mechanisms but not data splitting for model evaluation.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running experiments or simulations.
Software Dependencies	No	The paper mentions using logistic regression models, a Super Learner ensemble (with a library of candidates including generalized additive models and multivariate adaptive regression splines), and machine learning algorithms, but does not provide specific version numbers for any of these software components or libraries.
Experiment Setup	No	The paper mentions using logistic regression models for outcome, mediator, and exposure in the application section, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or other system-level training settings for these models.