Learning Representations for Counterfactual Inference
Authors: Fredrik Johansson, Uri Shalit, David Sontag
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform an empirical comparison with previous approaches to causal inference from observational data. Our deep learning algorithm significantly outperforms the previous state-of-the-art. |
| Researcher Affiliation | Academia | Fredrik D. Johansson (frejohk@chalmers.se), CSE, Chalmers University of Technology, Göteborg, SE-412 96, Sweden; Uri Shalit (shalit@cs.nyu.edu) and David Sontag (dsontag@cs.nyu.edu), CIMS, New York University, 251 Mercer Street, New York, NY 10012, USA |
| Pseudocode | Yes | Algorithm 1 Balancing counterfactual regression |
| Open Source Code | No | The paper provides no link to, or explicit statement of, a public release of the authors' own source code. It references third-party packages such as the NPCI package (Dorie, 2016) and the BayesTree R-package (Chipman & McCulloch, 2016), but not code of its own. |
| Open Datasets | Yes | Hill (2011) introduced a semi-simulated dataset based on the Infant Health and Development Program (IHDP). ... based on 50 LDA topics, trained on documents from the NY Times corpus (downloaded from UCI (Newman, 2008)). |
| Dataset Splits | Yes | Standard methods for hyperparameter selection, including cross-validation, are unavailable when training counterfactual models on real-world data, as there are no samples from the counterfactual outcome. In our experiments, all outcomes are simulated, and we have access to counterfactual samples. To avoid fitting parameters to the test set, we generate multiple repeated experiments, each with a different outcome function and pick hyperparameters once, for all models (and baselines), based on a held-out set of experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. It only mentions that 'The neural network architectures used for all experiments consist of fully-connected ReLU layers trained using RMSProp...' without hardware context. |
| Software Dependencies | No | The paper mentions 'RMSProp' and the 'BayesTree R-package (Chipman & McCulloch, 2016)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | The neural network architectures used for all experiments consist of fully-connected ReLU layers trained using RMSProp, with a small l2 weight decay, λ = 10⁻³. ... For the IHDP data we use layers of 25 hidden units each. For the News data representation layers have 400 units and output layers 200 units. |
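The setup quoted in the table (fully-connected ReLU layers trained with RMSProp and a small l2 weight decay of λ = 10⁻³) can be sketched as below. This is not the authors' code: the layer widths, learning rate, and RMSProp decay constant are illustrative assumptions; only the ReLU/RMSProp/weight-decay combination comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def init_layers(sizes):
    """Initialize weights for a stack of fully-connected layers."""
    return [rng.normal(0.0, 0.1, (m, n)) for m, n in zip(sizes[:-1], sizes[1:])]


def forward(Ws, x):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    h = x
    for W in Ws[:-1]:
        h = np.maximum(0.0, h @ W)
    return h @ Ws[-1]


def rmsprop_step(Ws, grads, cache, lr=1e-3, decay=0.9, wd=1e-3, eps=1e-8):
    """One RMSProp update; l2 weight decay (wd = 1e-3) is added to each gradient."""
    for i, (W, g) in enumerate(zip(Ws, grads)):
        g = g + wd * W  # small l2 weight decay, as in the paper
        cache[i] = decay * cache[i] + (1.0 - decay) * g**2
        Ws[i] = W - lr * g / (np.sqrt(cache[i]) + eps)
    return Ws, cache


# Example: an IHDP-style network with hidden layers of 25 units each
# (input/output dimensions here are illustrative).
Ws = init_layers([3, 25, 25, 1])
cache = [np.zeros_like(W) for W in Ws]
y = forward(Ws, np.ones((2, 3)))
```

The gradients themselves would come from backpropagation through whatever counterfactual-regression loss is used; the sketch only shows the optimizer and architecture shape described in the quote.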