reproducibilityindex.ai

Causal Inference using Gaussian Processes with Structured Latent Confounders

Authors: Sam Witty, Kenta Takatsu, David Jensen, Vikash Mansinghka

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the GP-SLC model using three benchmarks with known counterfactual outcomes. In Section 6.1, we evaluate GP-SLC using a fully synthetic hierarchical data generating process. In Section 6.2 we modify the Infant Health and Development Program (IHDP) benchmark (Hill, 2011) to include hierarchical structure and latent confounders. In Section 6.3 we introduce and evaluate on a new benchmark task for observational causal inference with hierarchical data, predicting the effect of changing temperatures on state-wide electric energy consumption in New England (NEEC).
Researcher Affiliation	Academia	(1) College of Information and Computer Sciences, University of Massachusetts, Amherst, United States; (2) Massachusetts Institute of Technology, Cambridge, United States. Correspondence to: Sam Witty <switty@cs.umass.edu>.
Pseudocode	Yes	Algorithm 1 Individual Treatment Effect Estimation, Algorithm 2 Hyperparameter Update Random Walk MH, Algorithm 3 Confounder Update Elliptical Slice Sampling
Open Source Code	No	The paper does not provide an explicit statement or link for the open-source code of the GP-SLC model.
Open Datasets	Yes	In Section 6.2 we modify the Infant Health and Development Program (IHDP) benchmark (Hill, 2011) to include hierarchical structure and latent confounders... We introduce a new benchmark for estimating heterogenous effects in hierarchically structured settings, predicting the effect of changing temperature on state-wide electric energy consumption in New England. ... Specifically, we generate data for the NEEC benchmark task using the New England Independent Service Operator's public records on hourly dry-bulb temperature and state-wide energy consumption for the 2018 calendar year (ISO New England, 2018)
Dataset Splits	No	The paper does not explicitly provide details about training, validation, and test dataset splits needed for reproduction. It mentions using 'benchmarks' and evaluating on 'held-out test data' for other models, but specific splits for GP-SLC are not defined.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper mentions using 'Gen (Cusumano-Towner et al., 2019)', a probabilistic programming language, but does not specify its version number or any other software dependencies with specific versions.
Experiment Setup	Yes	Except where otherwise specified we set $N_U = 3$ and $\alpha_\theta = \beta_\theta = 4$ for each inverse gamma prior over kernel hyperparameters and exogenous noise variance. We estimate individual treatment effects using Algorithm 1, with $N_{\mathrm{Outer}} = 5000$, $N_{\mathrm{MH}} = 3$, $N_{\mathrm{ES}} = 5$, and $\mathrm{drift}_\theta = 0.5$, $\forall \theta \in \Theta$.
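
The settings in the Experiment Setup row can be read as the schedule of an outer sampling loop that alternates random-walk MH updates of hyperparameters with elliptical slice sampling updates of latent variables. The sketch below is not the authors' Gen implementation and not the GP-SLC model. It is a minimal NumPy illustration on a toy model (standard normal prior on a latent vector, Gaussian likelihood with unknown noise variance). Only the numeric values (5000, 3, 5, 4, 0.5) come from the row above. All function and variable names are hypothetical, and it assumes that drift is the standard deviation of a Gaussian random-walk proposal and that the inverse gamma prior uses a shape/scale parameterization.

```python
import numpy as np

# Values quoted in the Experiment Setup row; the names are illustrative only.
N_OUTER, N_MH, N_ES = 5000, 3, 5
ALPHA, BETA, DRIFT = 4.0, 4.0, 0.5


def log_inv_gamma(x, alpha=ALPHA, beta=BETA):
    """Unnormalized log density of InverseGamma(shape=alpha, scale=beta)."""
    if x <= 0:
        return -np.inf
    return -(alpha + 1.0) * np.log(x) - beta / x


def mh_step(theta, log_post, rng, drift=DRIFT):
    """One Gaussian random-walk Metropolis-Hastings update of a scalar.

    Assumption: `drift` is the proposal standard deviation.
    """
    proposal = theta + drift * rng.standard_normal()
    log_ratio = log_post(proposal) - log_post(theta)
    # -Exponential(1) has the same distribution as log(Uniform(0, 1)).
    return proposal if -rng.exponential() < log_ratio else theta


def ess_step(u, chol, log_lik, rng):
    """One elliptical slice sampling update; prior is N(0, chol @ chol.T)."""
    nu = chol @ rng.standard_normal(u.shape[0])   # auxiliary prior draw
    log_y = log_lik(u) - rng.exponential()        # slice height
    angle = rng.uniform(0.0, 2.0 * np.pi)
    lo, hi = angle - 2.0 * np.pi, angle
    while True:
        cand = u * np.cos(angle) + nu * np.sin(angle)
        if log_lik(cand) > log_y:
            return cand
        # Shrink the bracket toward angle 0, which is the current state.
        if angle < 0.0:
            lo = angle
        else:
            hi = angle
        angle = rng.uniform(lo, hi)


def run_toy(y, n_outer=N_OUTER, seed=0):
    """Alternate N_MH hyperparameter updates and N_ES latent updates."""
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    u, noise_var = np.zeros(n), 1.0
    chol = np.eye(n)  # toy prior u ~ N(0, I), standing in for a GP prior

    def log_lik(latent, var):
        if var <= 0:
            return -np.inf
        return -0.5 * n * np.log(var) - 0.5 * np.sum((y - latent) ** 2) / var

    for _ in range(n_outer):
        for _ in range(N_MH):
            noise_var = mh_step(
                noise_var, lambda v: log_inv_gamma(v) + log_lik(u, v), rng
            )
        for _ in range(N_ES):
            u = ess_step(u, chol, lambda w: log_lik(w, noise_var), rng)
    return u, noise_var
```

Calling `run_toy(np.array([0.5, -1.0, 2.0]), n_outer=50)` returns the final latent vector and noise variance of a short chain. Proposals with non-positive variance get zero posterior density and are therefore always rejected. This is one simple way to keep a random walk on the positive reals and may differ from the paper's Algorithm 2.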