Removing Hidden Confounding by Experimental Grounding
Authors: Nathan Kallus, Aahlad Manas Puli, Uri Shalit
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run experiments on both simulation and real-world data and show our method outperforms the standard approaches to this problem. ... In order to illustrate the validity and usefulness of our proposed method we conduct simulation experiments and experiments with real-world data taken from the Tennessee STAR study: a large long-term school study where students were randomized to different types of classes [WJB+90, Kru99]. |
| Researcher Affiliation | Academia | Nathan Kallus, Cornell University and Cornell Tech, New York, NY, kallus@cornell.edu; Aahlad Manas Puli, New York University, New York, NY, apm470@nyu.edu; Uri Shalit, Technion, Haifa, Israel, urishalit@technion.ac.il |
| Pseudocode | Yes | Algorithm 1 Remove hidden confounding with unconfounded sample |
| Open Source Code | No | The paper does not provide any specific links to source code or explicit statements about code release. |
| Open Datasets | Yes | We approach this challenge by using data from a randomized controlled trial, the Tennessee STAR study [WJB+90, Kru99, MISN18]. |
| Dataset Splits | Yes | We split the entire dataset (ALL) into a small unconfounded subset (UNC), and a larger, confounded subset (CONF) over a somewhat different population. ... We then evaluate how well the CATE predictions match Y GT i on a held-out sample from ALL \ UNC (the set ALL minus the set UNC), in terms of RMSE. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Random Forest' and 'Ridge Regression' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The regression methods we use in (i)-(iii) are Random Forest with 200 trees and Ridge Regression with cross-validation. |
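The table quotes the paper's Algorithm 1 ("Remove hidden confounding with unconfounded sample") and its regressors (Random Forest with 200 trees, Ridge Regression with cross-validation). Below is a minimal Python sketch of that two-step idea, assuming scikit-learn is available: fit a possibly biased CATE estimator on the confounded sample, then learn an additive correction on the small unconfounded (randomized) sample. The T-learner arm split, the IPW pseudo-outcome with known propensity 0.5, and all variable names are illustrative assumptions on my part, not details taken from the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)

# Toy stand-ins for the two samples: a large confounded set (CONF)
# and a small unconfounded randomized set (UNC). Data-generating
# process here is synthetic, purely for illustration.
n_conf, n_unc, d = 2000, 200, 5
X_conf = rng.normal(size=(n_conf, d))
T_conf = rng.integers(0, 2, size=n_conf)
Y_conf = X_conf[:, 0] + T_conf * (1 + X_conf[:, 1]) + rng.normal(size=n_conf)

X_unc = rng.normal(size=(n_unc, d))
T_unc = rng.integers(0, 2, size=n_unc)  # randomized treatment, propensity 0.5
Y_unc = X_unc[:, 0] + T_unc * (1 + X_unc[:, 1]) + rng.normal(size=n_unc)

# Step 1: fit a (possibly confounded) CATE estimator on CONF,
# here via a T-learner: one Random Forest per treatment arm,
# 200 trees as in the quoted setup.
rf1 = RandomForestRegressor(n_estimators=200, random_state=0)
rf0 = RandomForestRegressor(n_estimators=200, random_state=0)
rf1.fit(X_conf[T_conf == 1], Y_conf[T_conf == 1])
rf0.fit(X_conf[T_conf == 0], Y_conf[T_conf == 0])

def cate_conf(X):
    """Biased CATE estimate learned from the confounded sample."""
    return rf1.predict(X) - rf0.predict(X)

# Step 2: on UNC, build an unbiased pseudo-outcome for the CATE
# (inverse-propensity weighting with the known propensity 0.5),
# then regress its residual on X with cross-validated Ridge to
# learn a correction for the hidden-confounding bias.
pseudo_y = Y_unc * (T_unc / 0.5 - (1 - T_unc) / 0.5)
residual = pseudo_y - cate_conf(X_unc)
correction = RidgeCV().fit(X_unc, residual)

def cate_hat(X):
    """Corrected CATE estimate: biased estimate plus learned correction."""
    return cate_conf(X) + correction.predict(X)

X_test = rng.normal(size=(10, d))
print(cate_hat(X_test).shape)
```

The key design point, per the quoted pseudocode, is that the small randomized sample is used only to fit a low-complexity correction term rather than a full CATE model, which is what lets it ground the larger confounded dataset.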