Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Doubly Robust Inference for Double Machine Learning in Semiparametric Regression

Authors: Oliver Dukes, Stijn Vansteelandt, David Whitney

JMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental In order to judge how well the methods are expected to perform in practice, we conducted three simulation experiments.
Researcher Affiliation Collaboration Oliver Dukes (EMAIL), Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium; Stijn Vansteelandt (EMAIL), Department of Applied Mathematics, Computer Science and Statistics, Ghent University, 9000 Ghent, Belgium; David Whitney (EMAIL), GSK, Gunnels Wood Road, Stevenage, SG1 2NY, U.K.
Pseudocode Yes We describe how this can be done below: 1. Divide the sample into disjoint parts I_k, each of size n_k = n/K, where K is a fixed integer (and assuming n is a multiple of K). For each I_k, let I_k^c denote all indices that are not in I_k. 2. Obtain the machine learning estimates ĝ_k^c(L) and m̂_k^c(L) from I_k^c. 3. Obtain the estimates Ĝ_k(L) and M̂_k(L) from I_k. 4. Obtain the estimates α̂_k and β̂_k via solving the equations: ... 5. For all i in I_k, obtain the score ... 6. Construct a test statistic ...
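The cross-fitting recipe quoted above can be sketched as follows. This is an illustrative skeleton only: the nuisance fits below are plain least-squares stand-ins, not the paper's estimators, and the estimating equations in steps 4–6 (elided with "..." in the quote) are not reproduced.

```python
import numpy as np

def cross_fit(Y, A, L, K=5):
    """Sketch of K-fold cross-fitting: nuisance models are fit on the
    complement I_k^c and evaluated on the held-out fold I_k.  The linear
    fits here are placeholders, not the paper's actual learners."""
    n = len(Y)
    assert n % K == 0, "the quoted recipe assumes n is a multiple of K"
    idx = np.arange(n)
    folds = np.array_split(idx, K)          # disjoint parts I_1, ..., I_K
    g_hat = np.empty(n)                     # out-of-fold predictions g^c_k(L)
    m_hat = np.empty(n)                     # out-of-fold predictions m^c_k(L)
    X = np.column_stack([np.ones(n), L])    # intercept + covariate design
    for I_k in folds:
        I_kc = np.setdiff1d(idx, I_k)       # complement indices I_k^c
        # Step 2: fit the nuisance estimates on I_k^c ...
        bg, *_ = np.linalg.lstsq(X[I_kc], Y[I_kc], rcond=None)
        bm, *_ = np.linalg.lstsq(X[I_kc], A[I_kc], rcond=None)
        # ... and evaluate them on the held-out fold I_k.
        g_hat[I_k] = X[I_k] @ bg
        m_hat[I_k] = X[I_k] @ bm
    return g_hat, m_hat
```

Each observation's nuisance predictions come from a model fit without that observation's fold, which is the property the sample-splitting steps are designed to guarantee.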
Open Source Code No The paper mentions using the 'hdm package in R', but the authors do not provide a statement or link to source code for the methodology described in this paper. The text states: 'we implemented using the hdm package in R (Chernozhukov et al., 2016).'
Open Datasets No The paper describes how data was generated for simulation studies, rather than utilizing pre-existing public datasets. For Experiment 1, it states: 'The first covariate L1 was generated from a U(−2, 2) distribution, whilst the second covariate L2 and exposure were both binary with respective expectations 0.5 and expit{−L1 + 2L1L2}. The outcome Y was simulated from a N(−L1 + 2L1L2, 1) distribution.'
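The quoted data-generating process for Experiment 1 can be sketched as below. Note the minus signs in −L1 + 2·L1·L2 are reconstructed from garbled extracted text (the original minus glyphs were dropped) and should be checked against the paper before relying on them.

```python
import numpy as np

def expit(x):
    """Inverse-logit (logistic) function."""
    return 1.0 / (1.0 + np.exp(-x))

def simulate_experiment1(n, seed=0):
    """Sketch of the Experiment 1 data-generating process as quoted above.
    Signs in the linear predictor are reconstructed assumptions."""
    rng = np.random.default_rng(seed)
    L1 = rng.uniform(-2.0, 2.0, size=n)             # L1 ~ U(-2, 2)
    L2 = rng.binomial(1, 0.5, size=n)               # binary, E[L2] = 0.5
    A = rng.binomial(1, expit(-L1 + 2 * L1 * L2))   # binary exposure
    Y = rng.normal(-L1 + 2 * L1 * L2, 1.0)          # outcome, unit variance
    return L1, L2, A, Y
```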
Dataset Splits Yes Five-fold cross-fitting was used in the construction of each of the tests. 1. Divide the sample into disjoint parts I_k, each of size n_k = n/K, where K is a fixed integer (and assuming n is a multiple of K). For each I_k, let I_k^c denote all indices that are not in I_k.
Hardware Specification No The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies No The paper mentions the use of 'the hdm package in R (Chernozhukov et al., 2016)' but does not specify version numbers for R or the hdm package itself, which is required for reproducibility.
Experiment Setup Yes The parameter that was consistently estimated was obtained using the Super Learner (van der Laan et al., 2007), whilst the inconsistently estimated parameter was obtained via ℓ1 penalised maximum likelihood with an omitted interaction term. Tuning parameters were selected using cross-validation... The parameters ζγ and ζβ were first both fixed at 0.82; we then considered a more challenging setting by lowering to ζβ = 0.2... We also reversed this, setting ζβ = 0.82 and ζγ = 0.2.
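The deliberately misspecified nuisance fit described above (ℓ1-penalised maximum likelihood with an omitted interaction term, tuning parameters chosen by cross-validation) could be sketched as follows. `LassoCV` is an illustrative stand-in for the paper's implementation (which used the hdm package in R), and `misspecified_lasso` is a hypothetical helper name, not from the paper.

```python
import numpy as np
from sklearn.linear_model import LassoCV

def misspecified_lasso(L1, L2, Y):
    """Sketch of the 'inconsistently estimated' nuisance fit: l1-penalised
    regression whose design deliberately omits the L1*L2 interaction term,
    with the penalty selected by cross-validation as in the quoted setup."""
    X = np.column_stack([L1, L2])   # main effects only: interaction omitted
    return LassoCV(cv=5).fit(X, Y)
```

Because the true outcome model depends on the L1·L2 interaction, this fit is inconsistent by construction, which is the scenario the experiment uses to probe double robustness.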