Causal Effect Inference with Deep Latent-Variable Models

Authors: Christos Louizos, Uri Shalit, Joris M. Mooij, David Sontag, Richard Zemel, Max Welling

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that in the presence of noisy proxies, our method is more robust against hidden confounding, in experiments where we successively add noise to known confounders. Towards that end we introduce a new causal inference benchmark using data about twin births and mortalities in the USA. We further show that our method is competitive on two existing causal inference benchmarks. Finally, we note that our method does not currently deal with the related problem of selection bias, and we leave this to future work.
Researcher Affiliation | Collaboration | Christos Louizos, University of Amsterdam & TNO Intelligent Imaging, c.louizos@uva.nl; Uri Shalit, New York University CIMS, uas1@nyu.edu; Joris Mooij, University of Amsterdam, j.m.mooij@uva.nl; David Sontag, Massachusetts Institute of Technology CSAIL & IMES, dsontag@mit.edu; Richard Zemel, University of Toronto & CIFAR, zemel@cs.toronto.edu; Max Welling, University of Amsterdam & CIFAR, m.welling@uva.nl
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a direct link to open-source code for the described methodology or explicitly state that the code is publicly available.
Open Datasets | Yes | Here we compare with two existing benchmark datasets where there is no need to model proxies, IHDP [21] and Jobs [33], often used for evaluating individual-level causal inference. ... We introduce a new benchmark task that utilizes data from twin births in the USA between 1989 and 1991 [3].
Dataset Splits | Yes | We follow [25, 48] and use 1000 replications of the simulated outcome, along with the same train/validation/testing splits. ... We further performed early stopping according to the lower bound on a validation set. ... The results after averaging over 10 train/validation/test splits can be seen in Table 2. (A sketch of this early-stopping and split-averaging protocol appears after the table.)
Hardware Specification | No | The paper mentions using "Tensorflow [1] and Edward [52]" but does not specify any hardware details such as GPU or CPU models, memory, or other computational resources used for the experiments.
Software Dependencies | No | For the implementation of our model we used Tensorflow [1] and Edward [52]. While the software stack is named, specific version numbers are not provided, which are necessary for reproducibility. (A version-logging snippet appears after the table.)
Experiment Setup | Yes | For the neural network architecture choices we closely followed [48]; unless otherwise specified we used 3 hidden layers with ELU [11] nonlinearities for the approximate posterior over the latent variables q(Z|X, t, y), the generative model p(X|Z) and the outcome models p(y|t, Z), q(y|t, X). For the treatment models p(t|Z), q(t|X) we used a single hidden layer neural network with ELU nonlinearities. Unless mentioned otherwise, we used a 20-dimensional latent variable z and used a small weight decay term for all of the parameters with λ = 0.0001. Optimization was done with Adamax [26] and a learning rate of 0.01, which was annealed with an exponential decay schedule. We further performed early stopping according to the lower bound on a validation set. (This configuration is sketched in code directly below.)
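
The experiment-setup row contains enough detail to sketch the model configuration in code. The authors implemented their model in TensorFlow and Edward, so the PyTorch sketch below is not their implementation; the hidden-layer width of 200 units and the exponential-decay rate are assumptions not stated in the quoted setup.

```python
import torch
import torch.nn as nn

def mlp(in_dim, out_dim, hidden=200, depth=3):
    """ELU network with `depth` hidden layers, per the quoted setup.
    The width of 200 units is an assumption, not stated in the quote."""
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, hidden), nn.ELU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

x_dim, z_dim = 25, 20            # x_dim is dataset-dependent; z is 20-dimensional

q_z = mlp(x_dim + 2, 2 * z_dim)  # q(Z|X, t, y): outputs mean and log-variance of z
p_x = mlp(z_dim, x_dim)          # generative model p(X|Z)
p_y = mlp(z_dim + 1, 1)          # outcome model p(y|t, Z)
q_y = mlp(x_dim + 1, 1)          # auxiliary outcome model q(y|t, X)
p_t = mlp(z_dim, 1, depth=1)     # treatment models: single hidden layer, ELU
q_t = mlp(x_dim, 1, depth=1)

params = [p for m in (q_z, p_x, p_y, q_y, p_t, q_t) for p in m.parameters()]

# Adamax, learning rate 0.01, weight decay lambda = 1e-4; the exponential
# learning-rate decay factor below is an assumed value.
opt = torch.optim.Adamax(params, lr=0.01, weight_decay=1e-4)
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.99)
```

Note that the paper conditions the outcome networks on t following the architecture of [48], which uses separate heads per treatment arm; concatenating t as an extra input, as above, is a simplification.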
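The dataset-splits row describes the evaluation protocol: early stopping on the variational lower bound of a validation set, with results averaged over the train/validation/test splits. A minimal sketch of that loop follows, where `train_one_epoch`, `validation_lower_bound`, `make_model`, `evaluate`, and `splits` are hypothetical stand-ins for the corresponding pieces of an implementation, and `max_epochs` and `patience` are assumed values the paper does not state.

```python
import copy
import numpy as np

def fit_with_early_stopping(model, train, valid, max_epochs=300, patience=10):
    """Keep the parameters from the epoch with the best validation lower bound."""
    best_bound, best_model, stall = -np.inf, copy.deepcopy(model), 0
    for _ in range(max_epochs):
        train_one_epoch(model, train)                 # hypothetical training step
        bound = validation_lower_bound(model, valid)  # ELBO on the validation set
        if bound > best_bound:
            best_bound, best_model, stall = bound, copy.deepcopy(model), 0
        else:
            stall += 1
            if stall >= patience:
                break
    return best_model

# Averaging over the 10 train/validation/test splits reported above:
errors = [evaluate(fit_with_early_stopping(make_model(), tr, va), te)
          for tr, va, te in splits]
print(f"error: {np.mean(errors):.3f} +/- {np.std(errors):.3f}")
```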
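Finally, the software versions flagged as missing above are cheap to capture. A snippet like the following, run once alongside the experiments, records the exact stack (assuming TensorFlow and Edward are importable and expose `__version__`):

```python
import sys
import tensorflow as tf
import edward as ed

# Record the interpreter and library versions actually used for the run,
# since the paper names the stack but not the versions.
print("python    ", sys.version.split()[0])
print("tensorflow", tf.__version__)
print("edward    ", ed.__version__)
```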