Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Debiased Bayesian inference for average treatment effects
Authors: Kolyan Ray, Botond Szabó
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We numerically illustrate our method on simulated and semi-synthetic data using GP priors, where our prior correction corresponds to a simple data-driven alteration to the covariance kernel. Our experiments demonstrate significant improvement in performance from this debiasing. |
| Researcher Affiliation | Academia | Kolyan Ray Department of Mathematics King's College London EMAIL Botond Szabó Mathematical Institute Leiden University EMAIL |
| Pseudocode | Yes | Algorithm 1 Debiased GP with PS correction |
| Open Source Code | No | The paper mentions using the "GPy package" but does not provide a link to its own source code or state that it is open-source. |
| Open Datasets | Yes | We consider a semi-synthetic dataset with real features and treatment assignments from the Infant Health and Development Program (IHDP), but simulated responses. The IHDP consisted of a randomized experiment studying whether low-birth-weight and premature infants benefited from intensive high-quality child care. The data contains d = 25 pretreatment variables per subject. Following [18] (also used in [2, 21]), an observational study is created by removing a non-random portion of the treatment group, namely all children with non-white mothers. |
| Dataset Splits | No | The paper describes the datasets used (synthetic and IHDP) and their generation, but it does not specify explicit train/validation/test splits (e.g., percentages or counts) or cross-validation details for the models trained and evaluated. It mentions running simulations 200 times but not how data is partitioned within each run for model training and evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "GPy package" and the "scaled conjugate gradient method option in the GPy package" but does not provide version numbers for GPy or any other software dependencies. |
| Experiment Setup | Yes | We optimize the hyperparameters $(\ell_i)_{i=1}^{d+1}$, $\rho_m$ and $\sigma_n$ (noise variance) by maximizing the marginal likelihood (using the scaled conjugate gradient method option in the GPy package). We set $\nu_n = 0.2\rho_m/(\sqrt{n} M_n)$ for $M_n = n^{-1} \sum_{i=1}^{n} [R_i/\hat{\pi}(X_i) + (1 - R_i)/(1 - \hat{\pi}(X_i))]$ the average absolute value of the last part of (5). |
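
The scaling rule quoted in the Experiment Setup row can be illustrated with a short sketch. It is not the authors' code: the function name `correction_scale` and its arguments are hypothetical, the estimated propensity scores are taken as given, and the denominator is assumed to be √n · M_n (the symbol before n is lost in the extracted text). Hyperparameter optimization with GPy is not reproduced here; `rho_m` is simply passed in.

```python
import math


def correction_scale(r, pi_hat, rho_m, c=0.2):
    """Compute M_n and nu_n from treatment indicators and propensity estimates.

    r      : list of 0/1 treatment indicators R_i
    pi_hat : list of estimated propensity scores, each strictly in (0, 1)
    rho_m  : kernel scale hyperparameter (assumed already optimized elsewhere)
    c      : constant multiplier, 0.2 in the quoted setup
    """
    if not r or len(r) != len(pi_hat):
        raise ValueError("r and pi_hat must be non-empty and of equal length")
    n = len(r)
    total = 0.0
    for r_i, p_i in zip(r, pi_hat):
        if not 0.0 < p_i < 1.0:
            raise ValueError("propensity estimates must lie strictly in (0, 1)")
        # R_i / pi_hat(X_i) + (1 - R_i) / (1 - pi_hat(X_i))
        total += r_i / p_i + (1 - r_i) / (1 - p_i)
    m_n = total / n
    # Assumption: the denominator is sqrt(n) * M_n.
    nu_n = c * rho_m / (math.sqrt(n) * m_n)
    return m_n, nu_n


# Toy inputs, invented purely for illustration.
m_n, nu_n = correction_scale([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5], rho_m=1.0)
print(m_n, nu_n)
```

Because M_n grows when estimated propensities approach 0 or 1, dividing by it shrinks the correction weight exactly where the inverse-propensity terms are largest.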