Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Removing Hidden Confounding by Experimental Grounding
Authors: Nathan Kallus, Aahlad Manas Puli, Uri Shalit
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run experiments on both simulation and real-world data and show our method outperforms the standard approaches to this problem. ... 5 Experiments. In order to illustrate the validity and usefulness of our proposed method we conduct simulation experiments and experiments with real-world data taken from the Tennessee STAR study: a large long-term school study where students were randomized to different types of classes [WJB+90, Kru99]. |
| Researcher Affiliation | Academia | Nathan Kallus (Cornell University and Cornell Tech, New York, NY); Aahlad Manas Puli (New York University, New York, NY); Uri Shalit (Technion, Haifa, Israel) |
| Pseudocode | Yes | Algorithm 1: Remove hidden confounding with unconfounded sample |
| Open Source Code | No | The paper does not provide any specific links to source code or explicit statements about code release. |
| Open Datasets | Yes | We approach this challenge by using data from a randomized controlled trial, the Tennessee STAR study [WJB+90, Kru99, MISN18]. |
| Dataset Splits | Yes | We split the entire dataset (ALL) into a small unconfounded subset (UNC), and a larger, confounded subset (CONF) over a somewhat different population. ... We then evaluate how well the CATE predictions match $Y^{GT}_i$ on a held-out sample from ALL \ UNC (the set ALL minus the set UNC), in terms of RMSE. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'Random Forest' and 'Ridge Regression' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The regression methods we use in (i)-(iii) are Random Forest with 200 trees and Ridge Regression with cross-validation. |
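
The Dataset Splits row describes an evaluation protocol that can be illustrated with a minimal sketch: draw a small UNC subset from ALL, take a held-out sample from ALL minus UNC, and score CATE predictions against the ground-truth values by RMSE. The function names, the uniform random sampling, and the `seed` parameter are assumptions made for illustration and are not taken from the paper; in particular, the construction of the confounded subset CONF is omitted because the summary above does not specify it.

```python
import math
import random


def split_unc_and_held_out(all_indices, unc_size, held_out_size, seed=0):
    """Return (unc, held_out) index lists.

    Assumption: uniform random sampling. The paper's UNC subset covers a
    "somewhat different population", so its actual split is not uniform.
    """
    rng = random.Random(seed)
    shuffled = list(all_indices)
    rng.shuffle(shuffled)
    unc = shuffled[:unc_size]
    remainder = shuffled[unc_size:]  # the set ALL minus the set UNC
    if held_out_size > len(remainder):
        raise ValueError("held-out sample is larger than ALL minus UNC")
    held_out = remainder[:held_out_size]
    return unc, held_out


def rmse(predicted, ground_truth):
    """Root-mean-square error between predictions and ground-truth values."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - g) ** 2 for p, g in zip(predicted, ground_truth)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    unc, held_out = split_unc_and_held_out(range(1000), unc_size=50, held_out_size=200)
    # Placeholder values only: a real run would use a fitted CATE model's
    # predictions and the ground-truth effects for the held-out units.
    predictions = [0.0 for _ in held_out]
    ground_truth = [0.1 for _ in held_out]
    print(len(unc), len(held_out), rmse(predictions, ground_truth))
```

For the regressors named in the Experiment Setup row, one plausible choice (an assumption, since the paper lists no software dependencies) would be scikit-learn's `RandomForestRegressor(n_estimators=200)` and `RidgeCV`.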