Not all distributional shifts are equal: Fine-grained robust conformal inference

Authors: Jiahao Ai, Zhimei Ren

ICML 2024

Reproducibility assessment (variable, result, and LLM response):
Research Type: Experimental
"We empirically evaluate the proposed methods WRCP and D-WRCP under a variety of simulation settings. In this section, we present several representative settings and leave the other results to Appendix E. We evaluate the performance of all methods on four real datasets: the national study of learning mindsets dataset (Carvalho et al., 2019), the ACS income dataset (Ding et al., 2021), the COVID information study datasets (Pennycook et al., 2020; Roozenbeek et al., 2021), and the poverty mapping dataset (Yeh et al., 2020; Koh et al., 2021)."
Researcher Affiliation: Academia
"1 School of Mathematical Sciences, Peking University, Beijing, China; 2 Department of Statistics and Data Science, University of Pennsylvania, Philadelphia, USA."
Pseudocode: Yes
"Algorithm 1: Weighted robust conformal prediction (WRCP) ... Algorithm 2: Debiased weighted robust conformal prediction (D-WRCP) ... Algorithm 3: Conformalized counterfactual inference under the f-sensitivity model"
Open Source Code: Yes
"All the numerical results in this paper can be reproduced with the code available at https://github.com/zhimeir/finegrained-conformal-paper."
Open Datasets: Yes
"We evaluate the performance of all methods on four real datasets: the national study of learning mindsets dataset (Carvalho et al., 2019), the ACS income dataset (Ding et al., 2021), the COVID information study datasets (Pennycook et al., 2020; Roozenbeek et al., 2021), and the poverty mapping dataset (Yeh et al., 2020; Koh et al., 2021)."
Dataset Splits: Yes
"The split conformal inference ... begins by randomly splitting the training data into two folds, D_tr^(0) and D_tr^(1), where n0 = |D_tr^(0)| and n1 = |D_tr^(1)|. It then uses D_tr^(0) for fitting the prediction function μ̂ : X → R and D_tr^(1) for obtaining the estimated quantile of S_{n+1}. For each run under a simulation setting, a training set D_tr and a test set D_test are generated, with |D_tr| = |D_test| = 2000."
Hardware Specification: No
The paper acknowledges "Wharton High Performance Computing for the computational resources" but does not specify particular hardware components such as GPU or CPU models.
Software Dependencies: No
The paper mentions using "the scikit-learn package in python", the "python package qosa-indices", and "XGBoost", but does not provide version numbers for these software components.
Experiment Setup: Yes
"For all the candidate methods, we implement the split version, where half of the data is reserved for model fitting and the other half for calibration. The nonconformity score is s(x, y) = |y − μ̂(x)| ... The robust parameter ρ ∈ {0.005, 0.01, . . . , 0.025} and the target coverage 90%."
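The quoted setup (a 50/50 fit/calibration split, the score s(x, y) = |y − μ̂(x)|, and 90% target coverage) follows the standard split conformal recipe. A minimal Python sketch of that generic recipe is below; the base model (linear regression) and function names are illustrative assumptions, and this is not the authors' WRCP/D-WRCP code:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def split_conformal(X_tr, y_tr, X_test, alpha=0.1, rng=None):
    """Generic split conformal prediction intervals at level 1 - alpha."""
    rng = np.random.default_rng(rng)
    n = len(y_tr)
    idx = rng.permutation(n)
    fit_idx, cal_idx = idx[: n // 2], idx[n // 2:]

    # Fit the prediction function mu_hat on the first fold D_tr^(0).
    mu_hat = LinearRegression().fit(X_tr[fit_idx], y_tr[fit_idx])

    # Nonconformity scores s(x, y) = |y - mu_hat(x)| on the second fold D_tr^(1).
    scores = np.abs(y_tr[cal_idx] - mu_hat.predict(X_tr[cal_idx]))

    # Empirical quantile with the finite-sample (n1 + 1) correction.
    n1 = len(cal_idx)
    level = min(1.0, np.ceil((n1 + 1) * (1 - alpha)) / n1)
    q = np.quantile(scores, level, method="higher")

    preds = mu_hat.predict(X_test)
    return preds - q, preds + q  # lower and upper interval endpoints
```

With alpha = 0.1 this targets the paper's 90% coverage; WRCP additionally adjusts the calibration quantile to guard against the distributional shifts allowed by the robust parameter ρ, a step not shown here.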