Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Robust Recursive Partitioning for Heterogeneous Treatment Effects with Uncertainty Quantification
Authors: Hyun-Suk Lee, Yao Zhang, William Zame, Cong Shen, Jang-Won Lee, Mihaela van der Schaar
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments using synthetic and semi-synthetic datasets (based on real-world data) demonstrate that R2P outperforms state-of-the-art methods by more robustly identifying subgroups while providing much narrower confidence intervals. |
| Researcher Affiliation | Academia | Hyun-Suk Lee Sejong University EMAIL Yao Zhang University of Cambridge EMAIL William R. Zame UCLA EMAIL Cong Shen University of Virginia EMAIL Jang-Won Lee Yonsei University EMAIL Mihaela van der Schaar University of Cambridge UCLA The Alan Turing Institute EMAIL |
| Pseudocode | Yes | Algorithm 1 Robust Recursive Partitioning |
| Open Source Code | Yes | The code of R2P is available at: https://bitbucket.org/mvdschaar/mlforhealthlabpub. |
| Open Datasets | Yes | The two semi-synthetic datasets are based on real world data; the first uses the Infant Health and Development Program (IHDP) dataset [24] and the second uses the Collaborative Perinatal Project (CPP) dataset [25]. |
| Dataset Splits | No | The paper describes internal splitting for the SCR method (training set I1 and validation set I2) and for subgroup partitioning, but does not provide specific train/validation/test dataset splits for the overall experimental evaluation on the named datasets. |
| Hardware Specification | No | The paper does not provide any specific hardware specifications (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components and methods like 'causal multi-task Gaussian process (CMGP)', 'Random Forest', 'multi-task Gaussian processes', and 'deep neural networks', but does not provide specific version numbers for any software or libraries used in the experiments. |
| Experiment Setup | Yes | We set the miscoverage rate to be α = 0.05, so we demand a 95% ITE coverage rate. |
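
The Dataset Splits and Experiment Setup rows refer to a split conformal regression (SCR) step with a training set I1, a validation set I2, and a miscoverage rate α = 0.05. The sketch below illustrates the general split conformal idea on an ordinary regression outcome, under stated assumptions: it is not the paper's R2P code, it does not estimate ITEs, and the function name `split_conformal_interval`, the least-squares model, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def split_conformal_interval(x_train, y_train, x_calib, y_calib, x_new, alpha=0.05):
    """Split conformal intervals for a 1-D linear model (illustrative only).

    x_train/y_train play the role of I1 (model fitting) and
    x_calib/y_calib the role of I2 (residual calibration).
    """
    # Fit ordinary least squares with an intercept on I1.
    design = np.column_stack([np.ones(len(x_train)), x_train])
    coef, *_ = np.linalg.lstsq(design, y_train, rcond=None)

    def predict(x):
        return np.column_stack([np.ones(len(x)), x]) @ coef

    # Absolute residuals on I2 serve as conformity scores.
    scores = np.sort(np.abs(y_calib - predict(x_calib)))
    n = len(scores)

    # Take the ceil((1 - alpha) * (n + 1))-th smallest score as the radius.
    k = int(np.ceil((1 - alpha) * (n + 1)))
    radius = scores[k - 1] if k <= n else np.inf

    center = predict(x_new)
    return center - radius, center + radius


# Synthetic data (an assumption for the demo, unrelated to IHDP or CPP).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)
y = 2.0 * x + rng.normal(0.0, 0.5, size=2000)

x_i1, y_i1 = x[:500], y[:500]            # I1: fit the model
x_i2, y_i2 = x[500:1000], y[500:1000]    # I2: calibrate the interval width
x_test, y_test = x[1000:], y[1000:]      # held-out points to check coverage

lower, upper = split_conformal_interval(x_i1, y_i1, x_i2, y_i2, x_test, alpha=0.05)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.3f}")
```

With α = 0.05 the empirical coverage on the held-out points should land close to the 95% target, which is the sense in which the setup row "demands" a 95% coverage rate. How the paper adapts this construction to ITE estimates and subgroup partitioning is described in the paper itself and is not reproduced here.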