Efficient Policy Evaluation Across Multiple Different Experimental Datasets

Authors: Yonghan Jung, Alexis Bellot

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically verified the robustness of estimators through simulations. In this section, we demonstrate the proposed estimators in Defs. (5,7) for combining multiple experimental datasets from different domains. We first compared the estimators on synthetic data to provide evidence of the fast convergence and double robustness behaviours of the proposed estimators. We conclude with an analysis of the ACTG 175 clinical trial [21] and Project STAR.
Researcher Affiliation | Collaboration | Yonghan Jung, Purdue University, jung222@purdue.edu; Alexis Bellot, Independent Researcher, abellot@gmail.com (now at Google DeepMind).
Pseudocode | Yes | Definition 5 (DML for combining two experiments). Let $D^2 \sim P^2_{\pi^2}(V)$, $D^1 \sim P^1_{\pi^1}(V)$ and $D^0 \sim P^0(C)$. Let $L \geq 2$ denote a fixed number. 1. Sample split: For $\ell = 1, \dots, L$, randomly split $D^i$ for $i \in \{0, 1, 2\}$ into $L$ folds. The $\ell$-th partition of the sample is denoted $D^i_{\ell}$. The complement is $D^i_{-\ell} := D^i \setminus D^i_{\ell}$. 2. Nuisance estimation: For each $\ell = 1, \dots, L$, learn the estimator models $\hat{\mu}^2_{\ell}$ and $\hat{\mu}^1_{\ell}$ for $\mu^2_0, \mu^1_0$ using samples $D^2_{-\ell}, D^1_{-\ell}$, respectively. Also, learn the estimation models $\hat{\omega}^1_{\ell}, \hat{\omega}^2_{\ell}$ for $\omega^1_0, \omega^2_0$ using samples $D^i_{-\ell}$ for $i = 0, 1, 2$, respectively. 3. Evaluation: The DML estimator $\hat{\psi}$ for $\mathbb{E}_{P^0_{\pi^0}}[Y]$ is then given as
Open Source Code | Yes | Codes corresponding to simulations are submitted as supplementary materials. [NeurIPS Checklist Q5 Justification]: The code will not be open sourced at this moment but we believe to have provided sufficient details to reproduce our results.
Open Datasets | Yes | We conclude with an analysis of the ACTG 175 clinical trial [21] and Project STAR. The dataset is publicly accessible from the R data repository: https://search.r-project.org/CRAN/refmans/AER/html/STAR.html.
Dataset Splits | Yes | Definition 5 (DML for combining two experiments). 1. Sample split: For $\ell = 1, \dots, L$, randomly split $D^i$ for $i \in \{0, 1, 2\}$ into $L$ folds. The $\ell$-th partition of the sample is denoted $D^i_{\ell}$. The complement is $D^i_{-\ell} := D^i \setminus D^i_{\ell}$.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as CPU or GPU models, or memory specifications.
Software Dependencies | No | The paper mentions using "XGBoost [12] to estimate nuisances" but does not specify a version number for XGBoost or any other software dependencies.
Experiment Setup | Yes | We ran 100 simulations for each $N \in \{2500, 5000, 10000, 20000\}$, where $N$ is the sample size. To enforce the convergence rate of nuisance estimates no faster than the decaying rate $n^{-1/4}$, we add $\epsilon$ to all nuisance estimates. This scenario is inspired by the experimental design discussed in [27]. The AE plots for combining two/multiple experiments are presented in Figs. (3a, 3b).
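
The sample-splitting step quoted from Definition 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code (which is not public): the function name split_folds, the dataset sizes, and the seeds are all hypothetical, and only the standard library is used. Each dataset is shuffled, cut into L folds, and every fold is paired with its complement. In a cross-fitting scheme the complement would be the part used to fit nuisance models.

```python
import random


def split_folds(n, L, seed=0):
    """Randomly partition indices 0..n-1 into L folds.

    Returns a list of (fold, complement) pairs, one per fold, where the
    complement holds every index that is not in the fold.
    """
    if L < 2:
        raise ValueError("L must be at least 2")
    if n < L:
        raise ValueError("need at least one sample per fold")
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    # Strided slices of the shuffled indices give folds of near-equal size.
    folds = [sorted(idx[l::L]) for l in range(L)]
    pairs = []
    for fold in folds:
        held_out = set(fold)
        complement = [i for i in range(n) if i not in held_out]
        pairs.append((fold, complement))
    return pairs


# Hypothetical sample sizes for the three datasets indexed by i in {0, 1, 2}.
sizes = {0: 10, 1: 12, 2: 9}
splits = {i: split_folds(n, L=3, seed=i) for i, n in sizes.items()}

for i, pairs in splits.items():
    for l, (fold, complement) in enumerate(pairs, start=1):
        print(f"dataset {i}, fold {l}: {len(fold)} held out, {len(complement)} in complement")
```

Splitting each dataset independently, as here, mirrors the wording "randomly split D^i for i in {0, 1, 2}". How the folds are then used in the evaluation step is not shown because the quoted definition is cut off before the estimator's formula.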