B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding
Authors: Miruna Oprescu, Jacob Dorn, Marah Ghoummaid, Andrew Jesson, Nathan Kallus, Uri Shalit
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Semi-synthetic experimental comparisons validate the theoretical findings, and we use real-world data to demonstrate how the method might be used in practice. We evaluate the B-Learner using synthetic and semi-synthetic experiments. In semi-synthetic experiments, we find the B-Learner is at least as effective as existing state-of-the-art models on a previously proposed benchmark. Finally, we illustrate the use of the B-Learner on real data to demonstrate how the method might be used in practice. |
| Researcher Affiliation | Academia | ¹Cornell University and Cornell Tech; ²Princeton University; ³Technion, Israel Institute of Technology; ⁴OATML, University of Oxford. |
| Pseudocode | Yes | Our procedure is summarized in Algorithm 1 (see Appendix E for a detailed version). Appendix E provides 'Algorithm 1 The B-Learner: Detailed'. |
| Open Source Code | Yes | We provide replication code at https://github.com/CausalML/BLearner. |
| Open Datasets | Yes | We replicate the experiment from Jesson et al. (2021) on IHDP Hidden Confounding. The dataset contains synthetic potential outcomes generated according to the response surface B described by Hill (2011). We use the real-world dataset from Chernozhukov & Hansen (2004) that draws on the 1991 Survey of Income and Program Participation. |
| Dataset Splits | Yes | Each realization is split into training (n = 470), validation (n = 202), and test (n = 75) subsets (see the split sketch after the table). |
| Hardware Specification | Yes | The results in Section 8 were obtained using an Amazon Web Services instance with 32 vCPUs and 64 GiB of RAM. |
| Software Dependencies | No | The paper mentions software packages like 'scikit-learn' and 'PyTorch' but does not specify version numbers, which are required for reproducibility. |
| Experiment Setup | Yes | Table 2 provides 'Hyperparameters for model choices in synthetic data experiments.', listing: Random Forest (scikit-learn): max_depth = 6, min_samples_leaf = 0.05; RBF (scikit-learn): length scale = 0.9 · n^(1/(4+d)); Neural Network (PyTorch): hidden units = 100, network depth = 4, negative slope = 0.3, dropout rate = 0.2, batch size = 50, learning rate = 5e-4. For the 401(k) data: 'hyperparameters (n_estimators = 100, max_depth = 7, max_features = 3, min_samples_leaf = 10)'. See the hyperparameter sketch after the table. |
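The IHDP splits reported above (train n = 470, validation n = 202, test n = 75, which together cover the 747 units in Hill's IHDP data) could be reproduced per realization with a sketch along the following lines; the shuffling scheme and the `split_realization` helper are assumptions for illustration, not the authors' code:

```python
import numpy as np

def split_realization(X, y, seed=0):
    """Split one IHDP realization into the reported train/val/test sizes.

    The sizes (470 / 202 / 75, totaling 747) come from the paper;
    the permutation-based splitting here is an assumption.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))  # IHDP has 747 units
    train, val, test = idx[:470], idx[470:672], idx[672:747]
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])
```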
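Similarly, the hyperparameters reported in Table 2 and for the 401(k) experiment could map onto model constructors as in this minimal sketch; only the hyperparameter values come from the paper, while the helper names and network layout are illustrative assumptions:

```python
import torch.nn as nn
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process.kernels import RBF

def make_random_forest():
    # Random Forest (scikit-learn): max_depth = 6, min_samples_leaf = 0.05
    return RandomForestRegressor(max_depth=6, min_samples_leaf=0.05)

def make_rbf_kernel(n, d):
    # RBF (scikit-learn): length scale 0.9 * n^(1/(4+d)) as reported in Table 2
    return RBF(length_scale=0.9 * n ** (1 / (4 + d)))

def make_neural_network(input_dim):
    # Neural Network (PyTorch): 100 hidden units, depth 4,
    # LeakyReLU negative slope 0.3, dropout 0.2; the reported batch
    # size (50) and learning rate (5e-4) belong to the training loop.
    layers, width = [], input_dim
    for _ in range(4):
        layers += [nn.Linear(width, 100), nn.LeakyReLU(0.3), nn.Dropout(0.2)]
        width = 100
    layers.append(nn.Linear(width, 1))
    return nn.Sequential(*layers)

def make_401k_forest():
    # 401(k) data: n_estimators = 100, max_depth = 7,
    # max_features = 3, min_samples_leaf = 10
    return RandomForestRegressor(n_estimators=100, max_depth=7,
                                 max_features=3, min_samples_leaf=10)
```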