Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Causal effects of intervening variables in settings with unmeasured confounding

Authors: Lan Wen, Aaron Sarvet, Mats Stensrud

JMLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The theoretical results are applied to data from the National Health and Nutrition Examination Survey. Keywords: Causal inference; Double robustness; Estimands; Frontdoor formula; Intervening variable; Separable effect
Researcher Affiliation | Academia | Lan Wen (EMAIL), Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada; Aaron L. Sarvet (EMAIL), Department of Biostatistics and Epidemiology, University of Massachusetts Amherst, Amherst, MA 01003, United States; Mats J. Stensrud (EMAIL), Department of Mathematics, École Polytechnique Fédérale de Lausanne, Lausanne, 1015, Switzerland
Pseudocode | Yes | Algorithm 1 Algorithm for Weighted ICE (generalized frontdoor formula) ... Algorithm 2 Algorithm for Targeted maximum likelihood (generalized frontdoor formula) ... Algorithm 3 Algorithm for Weighted ICE (frontdoor formula) ... Algorithm 4 Algorithm for iterative Targeted maximum likelihood (generalized frontdoor formula) ... Algorithm 5 Algorithm for Weighted ICE (generalized frontdoor formula for discrete exposure with more than two levels)
Open Source Code | No | No explicit statement or link to open-source code for the methodology described in the paper is provided.
Open Datasets | Yes | The theoretical results are applied to data from the National Health and Nutrition Examination Survey. ... We used the dataset from Inoue et al. (2022), which includes observations from the NHANES study linked to a national mortality database (National Death Index).
Dataset Splits | No | The paper mentions using data from the NHANES study but does not specify any training/test/validation splits for experiments. The simulation study describes data-generating mechanisms but not data splitting for model evaluation.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory used for running experiments or simulations.
Software Dependencies | No | The paper mentions using logistic regression models, a Super Learner ensemble (with a library of candidates including generalized additive models and multivariate adaptive regression splines), and machine learning algorithms, but does not provide specific version numbers for any of these software components or libraries.
Experiment Setup | No | The paper mentions using logistic regression models for the outcome, mediator, and exposure in the application section, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or other system-level training settings for these models.
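
The algorithms listed under Pseudocode build on the frontdoor formula. Since no code accompanies the paper, the following is a minimal illustrative sketch of a plug-in estimator for the classical frontdoor formula, E[Y(a)] = Σ_m P(M=m | A=a) Σ_a' E[Y | A=a', M=m] P(A=a'), with discrete exposure A and mediator M and no covariates. It is not the paper's weighted ICE or targeted maximum likelihood procedure, and it does not implement the generalized frontdoor formula. The function names `frontdoor_mean` and `simulate`, and the data-generating mechanism in `simulate`, are assumptions made here for illustration only.

```python
import numpy as np


def frontdoor_mean(a, m, y, a_star):
    """Plug-in estimate of E[Y(a_star)] via the classical frontdoor formula.

    Computes sum_m P(M=m | A=a_star) * sum_a' E[Y | A=a', M=m] * P(A=a')
    using empirical frequencies. Assumes discrete A and M, no covariates,
    and at least one observation with A == a_star.
    """
    a, m, y = (np.asarray(v) for v in (a, m, y))
    total = 0.0
    for m_val in np.unique(m):
        # Empirical P(M = m_val | A = a_star)
        p_m = np.mean(m[a == a_star] == m_val)
        inner = 0.0
        for a_val in np.unique(a):
            cell = (a == a_val) & (m == m_val)
            # Empty cells signal a positivity problem; here they are skipped,
            # which is a simplification made for this sketch.
            if cell.any():
                inner += y[cell].mean() * np.mean(a == a_val)
        total += p_m * inner
    return total


def simulate(n, seed=0):
    """Hypothetical data: U confounds A and Y; A affects Y only through M."""
    rng = np.random.default_rng(seed)
    u = rng.random(n) < 0.5                     # unmeasured confounder
    a = rng.random(n) < 0.2 + 0.6 * u           # exposure depends on U
    m = rng.random(n) < 0.2 + 0.6 * a           # mediator depends on A only
    y = rng.random(n) < 0.1 + 0.4 * m + 0.4 * u # outcome depends on M and U
    return a.astype(int), m.astype(int), y.astype(int)


if __name__ == "__main__":
    a, m, y = simulate(200_000)
    effect = frontdoor_mean(a, m, y, 1) - frontdoor_mean(a, m, y, 0)
    naive = y[a == 1].mean() - y[a == 0].mean()
    print(f"frontdoor estimate: {effect:.3f}, naive contrast: {naive:.3f}")
```

Under this hypothetical mechanism the true effect is 0.4 × 0.6 = 0.24, while the unadjusted contrast is confounded by U, so comparing the two printed values gives a quick sanity check of the formula.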