Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation

Authors: Divyat Mahajan, Ioannis Mitliagkas, Brady Neal, Vasilis Syrgkanis

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct an extensive empirical analysis to benchmark the surrogate model selection metrics introduced in the literature, as well as the novel ones introduced in this work."
Researcher Affiliation | Academia | "1 Mila, Université de Montréal; 2 Stanford University"
Pseudocode | No | "The paper does not contain structured pseudocode or algorithm blocks."
Open Source Code | Yes | "The code repository can be accessed here: github.com/divyat09/cate-estimator-selection"
Open Datasets | Yes | "We work with the ACIC 2016 (Dorie et al., 2019) benchmark, where we discard datasets that have variance in true CATE lower than 0.01 to ensure heterogeneity, which leaves us with 75 datasets from the ACIC 2016 competition. Further, we incorporate three realistic datasets, LaLonde PSID, LaLonde CPS (LaLonde, 1986), and Twins (Louizos et al., 2017), using RealCause (Neal et al., 2020)." (A sketch of the variance filter follows the table.)
Dataset Splits | Yes | "Since surrogate metrics involve approximating the ground-truth CATE (τ) (Eq. 6), we need to infer the associated nuisance models (η̌) on the validation set. Further, all nuisance models (η̌) are trained on the validation set using AutoML, specifically FLAML (Wang et al., 2021), with a budget of 30 minutes (Figure 3)." (A FLAML sketch follows the table.)
Hardware Specification | No | "The experiments were enabled in part by computational resources provided by Calcul Québec (calculquebec.ca) and the Digital Research Alliance of Canada (alliancecan.ca)."
Software Dependencies | No | "We used sklearn for implementing all the regression models, and we use the same notation from sklearn for representing the regression model class and the corresponding hyperparameter names."
Experiment Setup | Yes | Regression model classes and hyperparameter grids (sklearn notation):
- Ridge Regression; Hyperparameters (α): np.logspace(-4, 5, 10)
- Kernel Ridge Regression; Hyperparameters (α): np.logspace(-4, 5, 10)
- Lasso Regression; Hyperparameters (α): np.logspace(-4, 5, 10)
- Elastic Net Regression; Hyperparameters (α): np.logspace(-4, 5, 10)
- SVR (sigmoid kernel); Hyperparameters (C): np.logspace(-4, 5, 10)
- SVR (RBF kernel); Hyperparameters (C): np.logspace(-4, 5, 10)
- Linear SVR; Hyperparameters (C): np.logspace(-4, 5, 10)
- Decision Tree; Hyperparameters (max_depth): list(range(2, 11)) + [None]
- Random Forest; Hyperparameters (max_depth): list(range(2, 11)) + [None]
- Gradient Boosting; Hyperparameters (max_depth): list(range(2, 11)) + [None]
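
The variance filter quoted in the Open Datasets row is straightforward to sketch. A minimal version, assuming each dataset exposes its ground-truth CATE as an array; the `datasets` mapping is a hypothetical interface, not the paper's actual loader:

```python
import numpy as np

def filter_heterogeneous(datasets, min_cate_variance=0.01):
    """Discard datasets whose ground-truth CATE varies too little.

    `datasets` is a hypothetical mapping: dataset name -> array of true
    CATE values. The paper's actual data-loading interface may differ.
    """
    return {name: cate for name, cate in datasets.items()
            if np.var(cate) >= min_cate_variance}
```

Applied to the ACIC 2016 competition data, a filter of this form is what leaves the 75 datasets cited in the excerpt.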
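The Dataset Splits row describes fitting nuisance models on the validation split with a 30-minute FLAML budget. A minimal sketch using FLAML's standard `AutoML` interface; the outcome/propensity decomposition and the variable names are illustrative assumptions, since the excerpt does not enumerate the exact nuisance set:

```python
import numpy as np
from flaml import AutoML

def fit_nuisances(X_val, T_val, Y_val, budget_seconds=1800):
    """Fit nuisance models on the validation split with a 30-minute
    FLAML budget (1800 s), per the excerpt above. The outcome/propensity
    split shown here is an illustrative choice, not the paper's exact set.
    """
    # Outcome model mu(x, t): regress Y on covariates plus treatment.
    outcome = AutoML()
    outcome.fit(X_train=np.column_stack([X_val, T_val]),
                y_train=Y_val, task="regression",
                time_budget=budget_seconds)

    # Propensity model e(x): classify treatment from covariates.
    propensity = AutoML()
    propensity.fit(X_train=X_val, y_train=T_val,
                   task="classification", time_budget=budget_seconds)
    return outcome, propensity
```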
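Finally, the Experiment Setup grids map directly onto sklearn estimators. In the sketch below, the lower exponent −4 in `np.logspace(-4, 5, 10)` is an assumption about the intended grid (the sign is ambiguous in the extracted text), and collapsing the two SVR kernels into one grid is a presentational choice:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.kernel_ridge import KernelRidge
from sklearn.svm import SVR, LinearSVR
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

alpha_grid = np.logspace(-4, 5, 10)       # regularization strengths (alpha)
c_grid = np.logspace(-4, 5, 10)           # SVR penalty parameter C
depth_grid = list(range(2, 11)) + [None]  # tree depths; None = unbounded

# Candidate regression model classes and their hyperparameter grids,
# transcribed from the table row above using sklearn's own names.
# The lower exponent -4 in the logspace grids is an assumption.
search_space = {
    Ridge: {"alpha": alpha_grid},
    KernelRidge: {"alpha": alpha_grid},
    Lasso: {"alpha": alpha_grid},
    ElasticNet: {"alpha": alpha_grid},
    SVR: {"C": c_grid, "kernel": ["sigmoid", "rbf"]},
    LinearSVR: {"C": c_grid},
    DecisionTreeRegressor: {"max_depth": depth_grid},
    RandomForestRegressor: {"max_depth": depth_grid},
    GradientBoostingRegressor: {"max_depth": depth_grid},
}
```

Each (class, grid) pair can then be enumerated with, for example, `sklearn.model_selection.GridSearchCV` to produce the candidate estimators that the surrogate metrics rank.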