Generalization Bounds for Causal Regression: Insights, Guarantees and Sensitivity Analysis

Authors: Daniel Csillag, Cláudio José Struchiner, Guilherme Tegoni Goedert

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this work, we propose a theory based on generalization bounds that provides such guarantees. By introducing a novel change-of-measure inequality, we are able to tightly bound the model loss in terms of the deviation of the treatment propensities over the population, which we show can be empirically limited. Our theory is fully rigorous and holds even in the face of hidden confounding and violations of positivity. We demonstrate our bounds on semi-synthetic and real data, showcasing their remarkable tightness and practical utility." (See the sketch after this table.)
Researcher Affiliation | Academia | "School of Applied Mathematics, Fundação Getúlio Vargas, Rio de Janeiro, Brazil."
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "More details can be found in Appendix B, and the code is available at https://github.com/dccsillag/experiments-causal-generalization-bounds."
Open Datasets | Yes | "Learned IHDP: Results of a randomized control trial simulated with generative models trained on the IHDP (Hill, 2011) dataset. ACIC16: Simulated observational data from (Dorie et al., 2017) with fully observed confounding... Parkinson's Telemonitoring dataset of (Tsanas et al., 2009)"
Dataset Splits | No | The paper mentions using "training samples" but does not provide specific details on dataset splits (e.g., percentages, counts, or an explicit cross-validation setup) for training, validation, and testing.
Hardware Specification | Yes | "Experiments were run on an AMD Ryzen 9 5950X CPU (2.2GHz/5.0GHz, 32 threads) with 64GB of RAM."
Software Dependencies | No | The paper mentions software such as Scikit-Learn and GeomLoss but does not specify version numbers for reproducibility.
Experiment Setup | No | The paper mentions using "default hyperparameters" for Random Forests but does not provide specific hyperparameter values or detailed training configurations for the other models or the overall experimental setup.
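The "Research Type" evidence above summarizes the paper's central technical move: a change-of-measure inequality that bounds the model loss in terms of the deviation of the treatment propensities. The paper's novel inequality is not reproduced on this page; purely as a point of reference, the classical importance-weighting change of measure, which (unlike the paper's result) requires positivity, takes the following form:

```latex
% Illustrative sketch of the *classical* change-of-measure bound, not the
% paper's novel inequality. Here \ell is the model loss,
% e(x) = P(T = 1 \mid X = x) is the treatment propensity, and
% P_1 is the law of X among the treated (X given T = 1).
\[
  \mathbb{E}_{X \sim P}\bigl[\ell(X)\bigr]
    = \mathbb{E}_{X \sim P_1}\!\left[\frac{P(T=1)}{e(X)}\,\ell(X)\right]
    \;\le\; \left(\sup_{x} \frac{P(T=1)}{e(x)}\right)
            \mathbb{E}_{X \sim P_1}\bigl[\ell(X)\bigr].
\]
```

This classical bound degenerates as e(x) approaches 0, which is precisely the positivity issue; per the abstract, the paper's bound instead controls the loss via the deviation of the propensities over the population and remains valid under hidden confounding and violations of positivity.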