Causal Isotonic Calibration for Heterogeneous Treatment Effects

Authors: Lars van der Laan, Ernesto Ulloa-Pérez, Marco Carone, Alex Luedtke

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In Section 5, we examine the performance of our method in simulations. |
| Researcher Affiliation | Academia | (1) Department of Statistics, University of Washington, USA; (2) Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, USA; (3) Department of Biostatistics, University of Washington, USA. |
| Pseudocode | Yes | Algorithm 1: Causal isotonic calibration; Algorithm 2: Causal isotonic cross-calibration (unpooled); Algorithm 3: Causal isotonic cross-calibration (pooled). |
| Open Source Code | Yes | R code implementing causal isotonic calibration with user-supplied (cross-fitted) nuisance estimates and predictions is provided in the GitHub package causalCalibration, available at https://github.com/Larsvanderlaan/causalCalibration. |
| Open Datasets | No | In simulation studies, data units were generated as follows for the two scenarios considered. |
| Dataset Splits | No | The paper describes data splitting for the calibration procedure (e.g., "sample splitting involves randomly partitioning Dn into Em ∪ Cℓ") and cross-validation for model selection within the Super Learner, but it does not report standard training/validation/test splits of a pre-existing dataset, as all experimental data were generated synthetically. |
| Hardware Specification | No | The paper does not specify the hardware used to run the experiments (e.g., CPU/GPU models, memory, or cloud instance types). |
| Software Dependencies | Yes | We used the implementation of these estimators found in R package sl3 (Coyle et al., 2021)... R package version 1.4.2. ... Finally, we used the R function isoreg to perform the isotonic regression step. |
| Experiment Setup | Yes | In Scenario 1, to estimate the CATE, we implemented gradient-boosted regression trees (GBRT) with maximum depths equal to 2, 5, and 8 (Chen & Guestrin, 2016), random forests (RF) (Breiman, 2001), generalized linear models with lasso regularization (GLMnet) (Friedman et al., 2010), generalized additive models (GAM) (Wood, 2017), and multivariate adaptive regression splines (MARS) (Friedman, 1991). In Scenario 2, we implemented RF, GLMnet, and variable screening with lasso regularization followed by GBRT with maximum depth selected via cross-validation. ... Additionally, for numerical stability, we thresholded the estimated propensity scores to lie between 0.01 and 0.99. |
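
The Pseudocode and Software Dependencies rows together pin down the core computation: Algorithm 1 isotonically regresses doubly robust pseudo-outcomes on the initial CATE predictions, with the monotone fit computed by base R's isoreg. Below is a minimal, hypothetical R sketch of that step, assuming cross-fitted nuisance estimates are supplied as plain vectors; the variable names and the causal_iso_calibrate wrapper are illustrative, not the causalCalibration package's actual API.

```r
# Hedged sketch of causal isotonic calibration (Algorithm 1); all argument
# names are hypothetical. On a calibration set:
#   tau_hat : initial CATE predictions tau-hat(W)
#   mu1, mu0: cross-fitted outcome regressions E[Y | A = 1, W], E[Y | A = 0, W]
#   pi_hat  : cross-fitted propensity scores P(A = 1 | W)
#   A, Y    : observed binary treatment and outcome
causal_iso_calibrate <- function(tau_hat, mu1, mu0, pi_hat, A, Y) {
  # AIPW pseudo-outcome whose conditional mean given W equals the CATE
  chi <- mu1 - mu0 +
    (A / pi_hat - (1 - A) / (1 - pi_hat)) * (Y - ifelse(A == 1, mu1, mu0))
  # Isotonic regression of pseudo-outcomes on the initial predictions
  iso <- stats::isoreg(x = tau_hat, y = chi)
  # Step function mapping (new) predictions to calibrated values
  calibrator <- stats::as.stepfun(iso)
  list(calibrated = calibrator(tau_hat), calibrator = calibrator)
}
```

For the packaged implementation itself, the repository linked above can presumably be installed with `remotes::install_github("Larsvanderlaan/causalCalibration")`.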
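
The propensity-score truncation mentioned in the Experiment Setup row is a one-liner in base R. The 0.01/0.99 bounds come from the paper; `pi_hat` is an assumed name for the vector of estimated propensity scores.

```r
# Clip estimated propensity scores to [0.01, 0.99] for numerical stability,
# as described in the Experiment Setup row (pi_hat is a hypothetical name).
pi_hat <- pmin(pmax(pi_hat, 0.01), 0.99)
```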