reproducibilityindex.ai

Conformal Prediction Under Covariate Shift

Authors: Ryan J. Tibshirani, Rina Foygel Barber, Emmanuel Candès, Aaditya Ramdas

NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate conformal prediction in the covariate shift setting using an empirical example. We consider the airfoil data set from the UCI Machine Learning Repository [Dua and Graff, 2019], which has N = 1503 observations of a response Y (scaled sound pressure level of NASA airfoils), and a covariate X with d = 5 dimensions (log frequency, angle of attack, chord length, free-stream velocity, and suction side log displacement thickness). For efficiency, we use a variant of conformal prediction called split conformal prediction [Papadopoulos et al., 2002, Lei et al., 2015], which we extend to the covariate shift case in the same way (using weighted quantiles); see the supplement. For R code to reproduce the results that follow, see http://www.github.com/ryantibs/conformal/. Creating training data, test data, and covariate shift. We repeated an experiment for 5000 trials, where for each trial we randomly partitioned the data $\{(X_i, Y_i)\}_{i=1}^N$ into two sets $D_{\mathrm{train}}$, $D_{\mathrm{test}}$, and also constructed a covariate shift test set $D_{\mathrm{shift}}$, which have the following roles.
Researcher Affiliation	Academia	Ryan J. Tibshirani Department of Statistics Machine Learning Department Carnegie Mellon University Pittsburgh PA, 15213 ryantibs@cmu.edu Rina Foygel Barber Department of Statistics University of Chicago Chicago, IL 60637 rina@uchicago.edu Emmanuel J. Candès Department of Statistics Department of Mathematics Stanford University Stanford CA, 94305 candes@stanford.edu Aaditya Ramdas Department of Statistics Machine Learning Department Carnegie Mellon University Pittsburgh PA, 15213 aramdas@cmu.edu
Pseudocode	No	The paper describes procedures using mathematical formulas and text but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	For R code to reproduce the results that follow, see http://www.github.com/ryantibs/conformal/.
Open Datasets	Yes	We consider the airfoil data set from the UCI Machine Learning Repository [Dua and Graff, 2019]
Dataset Splits	Yes	Creating training data, test data, and covariate shift. We repeated an experiment for 5000 trials, where for each trial we randomly partitioned the data $\{(X_i, Y_i)\}_{i=1}^N$ into two sets $D_{\mathrm{train}}$, $D_{\mathrm{test}}$, and also constructed a covariate shift test set $D_{\mathrm{shift}}$, which have the following roles. $D_{\mathrm{train}}$, containing 50% of the data, is our training set, i.e., $(X_i, Y_i)$, $i = 1, \ldots, n$, used to compute conformal prediction intervals (using the split conformal variant). $D_{\mathrm{test}}$, containing 50% of the data, is our test set (as these data points are exchangeable with those in $D_{\mathrm{train}}$, there is no covariate shift in this test set). $D_{\mathrm{shift}}$ is a second test set, constructed to simulate covariate shift, by sampling 25% of the points from $D_{\mathrm{test}}$ with replacement, with probabilities proportional to $w(x) = \exp(x^T \beta)$, where $\beta = (-1, 0, 0, 0, 1)$.
Hardware Specification	No	No specific hardware details (like GPU/CPU models, memory, or cloud instance types) used for the experiments were provided.
Software Dependencies	No	The paper mentions “R code to reproduce the results” but does not specify the version of R or any specific R packages with version numbers.
Experiment Setup	Yes	The nominal coverage level was set to be 90% (meaning α = 0.1), here and throughout. (...) $D_{\mathrm{shift}}$ is a second test set, constructed to simulate covariate shift, by sampling 25% of the points from $D_{\mathrm{test}}$ with replacement, with probabilities proportional to $w(x) = \exp(x^T \beta)$, where $\beta = (-1, 0, 0, 0, 1)$. (...) The bottom row of Figure 1 shows the results from using weighted split conformal prediction to cover the points in $D_{\mathrm{shift}}$, where the weight function $\hat{w}$ has been estimated as in (9), using logistic regression (in gray) and random forests (in green) to fit the class probability function $\hat{p}$. (...) In the random forests approach, we clipped the estimated test class probability $\hat{p}(x)$ to lie in between 0.01 and 0.99, to prevent the estimated weight (likelihood ratio) $\hat{w}(x)$ from being infinite.
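
The split and reweighting steps quoted in the table can be illustrated with a minimal Python sketch. It is not the authors' R code, and it rests on several assumptions. The data are synthetic stand-ins with the same shape as the airfoil set. The base predictor is ordinary least squares. The tilt vector is taken as (-1, 0, 0, 0, 1), on the assumption that the extracted text dropped a minus sign. The conformal step uses the known tilt weights rather than an estimated weight function. The estimated weight is assumed to be the odds p/(1 - p) of the clipped class probability. All function names (`tilt_weights`, `weights_from_probs`, `weighted_conformal_quantile`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in with the airfoil shape (N = 1503, d = 5); NOT the real data.
N, d = 1503, 5
X = rng.normal(size=(N, d))
Y = X @ rng.normal(size=d) + rng.normal(size=N)

# Random 50/50 partition into D_train and D_test.
perm = rng.permutation(N)
n_train = N // 2
train_idx, test_idx = perm[:n_train], perm[n_train:]

# Assumed tilt vector (sign of the first entry is an assumption, see lead-in).
beta = np.array([-1.0, 0.0, 0.0, 0.0, 1.0])

def tilt_weights(Z, beta):
    """Exponential tilting weights w(x) = exp(x^T beta)."""
    return np.exp(Z @ beta)

# D_shift: 25% of D_test, drawn with replacement, probabilities proportional to w(x).
w_test = tilt_weights(X[test_idx], beta)
n_shift = int(round(0.25 * len(test_idx)))
shift_idx = rng.choice(test_idx, size=n_shift, replace=True, p=w_test / w_test.sum())

def weights_from_probs(p_hat, lo=0.01, hi=0.99):
    """Clip estimated test-class probabilities, then convert to odds p / (1 - p)."""
    p = np.clip(p_hat, lo, hi)
    return p / (1.0 - p)

def weighted_conformal_quantile(scores, w_cal, w_x, alpha=0.1):
    """Level (1 - alpha) quantile of the weighted score distribution,
    with the test point's own weight placed as a point mass at +infinity."""
    order = np.argsort(scores)
    s = np.append(scores[order], np.inf)
    p = np.append(w_cal[order], w_x)
    cdf = np.cumsum(p / p.sum())
    k = min(int(np.searchsorted(cdf, 1 - alpha)), len(s) - 1)  # first cdf >= 1 - alpha
    return s[k]

# Split conformal: fit on one half of D_train, calibrate on the other half.
fit_idx, cal_idx = train_idx[: n_train // 2], train_idx[n_train // 2:]
A = np.column_stack([np.ones(len(fit_idx)), X[fit_idx]])
coef, *_ = np.linalg.lstsq(A, Y[fit_idx], rcond=None)

def predict(Z):
    """Least-squares prediction with an intercept (stand-in base model)."""
    return np.column_stack([np.ones(len(Z)), Z]) @ coef

scores = np.abs(Y[cal_idx] - predict(X[cal_idx]))  # absolute residual scores
w_cal = tilt_weights(X[cal_idx], beta)

# Empirical coverage of the weighted intervals on D_shift at alpha = 0.1.
covered = []
for i in shift_idx:
    q = weighted_conformal_quantile(scores, w_cal, tilt_weights(X[i], beta), alpha=0.1)
    covered.append(abs(Y[i] - predict(X[i:i + 1])[0]) <= q)
coverage = float(np.mean(covered))
print(f"weighted split conformal coverage on D_shift: {coverage:.3f}")
```

To mimic the estimated-weight setting, `w_cal` and the per-point test weight could be replaced by `weights_from_probs` applied to the output of any probabilistic classifier trained to separate training covariates from shifted test covariates. The classifier itself is left out here to keep the sketch dependency-free.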