Uncertainty Sets for Image Classifiers using Conformal Prediction

Authors: Anastasios Nikolas Angelopoulos, Stephen Bates, Michael Jordan, Jitendra Malik

ICLR 2021

Reproducibility assessment (variable, result, and supporting excerpt from the paper):

- Research Type: Experimental. "In experiments on both Imagenet and Imagenet-V2 with ResNet-152 and other classifiers, our scheme outperforms existing approaches, achieving coverage with sets that are often factors of 5 to 10 smaller than a stand-alone Platt scaling baseline."
- Researcher Affiliation: Academia. "Anastasios N. Angelopoulos, Stephen Bates, Jitendra Malik, & Michael I. Jordan. Departments of Electrical Engineering and Computer Sciences and Statistics, University of California, Berkeley. {angelopoulos,stephenbates,malik,jordan}@cs.berkeley.edu"
- Pseudocode: Yes. Algorithm 1 (Naive Prediction Sets); Algorithm 2 (RAPS Conformal Calibration); Algorithm 3 (RAPS Prediction Sets); Algorithm 4 (Adaptive Fixed-K).
- Open Source Code: Yes. "We will provide an accompanying codebase that implements our method as a wrapper for any PyTorch classifier, along with code to exactly reproduce all of our experiments."
- Open Datasets: Yes. "For evaluations, we focus on Imagenet classification"; "In experiments on both Imagenet and Imagenet-V2 with ResNet-152 and other classifiers".
- Dataset Splits: Yes. "Over 100 trials, we randomly sampled two subsets of Imagenet-Val: one conformal calibration subset of size 20K and one evaluation subset of size 20K."
- Hardware Specification: No. The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, or cloud instance specifications).
- Software Dependencies: No. The paper mentions using a "PyTorch classifier" but does not give version numbers for PyTorch or any other software libraries or dependencies.
- Experiment Setup: Yes. "In our experiments, we use nine standard, pretrained Imagenet classifiers from the torchvision repository (Paszke et al., 2019) with standard normalization, resize, and crop parameters. Before applying naive, APS, or RAPS, we calibrated the classifiers using the standard temperature scaling/Platt scaling procedure as in Guo et al. (2017) on the calibration set. Thereafter, naive, APS, and RAPS were applied, with RAPS using a data-driven choice of parameters described in Appendix E."
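The dataset-splits row describes randomly sampling two disjoint 20K subsets of Imagenet-Val per trial, one for conformal calibration and one for evaluation. A minimal sketch of such a split (the function name and seed are illustrative, not from the paper's codebase):

```python
import numpy as np

def split_calibration_eval(n_total, n_cal=20000, n_eval=20000, seed=0):
    """Draw two disjoint random index subsets: one for conformal
    calibration and one for evaluation, as in the paper's 20K/20K setup."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_total)          # random order over all examples
    return perm[:n_cal], perm[n_cal:n_cal + n_eval]

# Imagenet-Val has 50,000 images; each of the 100 trials would use a fresh seed.
cal_idx, eval_idx = split_calibration_eval(50000, seed=0)
```

Repeating this over 100 seeds and averaging reproduces the trial structure the excerpt describes.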
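The experiment-setup row says classifiers were first calibrated with the standard temperature scaling of Guo et al. (2017) before any prediction sets were formed. A sketch of that procedure under the usual formulation (fit a single scalar T > 0 minimizing the negative log-likelihood of softmax(logits / T) on the calibration set); the helper name and optimizer bounds are assumptions, not the paper's code:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_temperature(logits, labels):
    """Fit the temperature T minimizing NLL of softmax(logits / T),
    i.e. standard temperature scaling (Guo et al., 2017)."""
    labels = np.asarray(labels)

    def nll(T):
        z = logits / T
        z = z - z.max(axis=1, keepdims=True)          # numerical stability
        logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -logp[np.arange(len(labels)), labels].mean()

    # Bounds are an arbitrary but generous search range for T.
    return minimize_scalar(nll, bounds=(0.05, 10.0), method="bounded").x
```

The fitted T is then applied to held-out logits (`softmax(logits / T)`) before building prediction sets.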
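Of the four listed algorithms, Algorithm 1 ("Naive Prediction Sets") is the simplest baseline: include classes in decreasing order of softmax score until their cumulative mass reaches 1 − α. A minimal sketch of that idea (function and variable names are illustrative; the paper's actual algorithm may differ in details such as tie-breaking or randomization):

```python
import numpy as np

def naive_prediction_sets(softmax_scores, alpha=0.1):
    """For each row of class probabilities, greedily add classes
    (most probable first) until cumulative mass >= 1 - alpha."""
    sets = []
    for probs in softmax_scores:
        order = np.argsort(probs)[::-1]              # classes, most likely first
        cum = np.cumsum(probs[order])
        # smallest prefix whose cumulative mass reaches 1 - alpha
        k = int(np.searchsorted(cum, 1 - alpha)) + 1
        sets.append(order[:k])
    return sets

probs = np.array([[0.5, 0.3, 0.15, 0.05]])
sets = naive_prediction_sets(probs, alpha=0.1)       # classes 0, 1, 2 cover 95%
```

RAPS (Algorithms 2–3) modifies this score with a regularization term to discourage very large sets, with parameters chosen as described in the paper's Appendix E.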