PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction

Authors: Sangdon Park, Osbert Bastani, Nikolai Matni, Insup Lee

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we evaluate our approach on three benchmarks: ResNet (He et al., 2016) for ImageNet (Russakovsky et al., 2015), a model (Held et al., 2016) learned for a visual object tracking benchmark (Wu et al., 2013), and a probabilistic dynamics model (Chua et al., 2018) learned for the half-cheetah environment (Brockman et al., 2016) (Section 4).
Researcher Affiliation | Academia | Sangdon Park, University of Pennsylvania, sangdonp@cis.upenn.edu; Osbert Bastani, University of Pennsylvania, obastani@seas.upenn.edu; Nikolai Matni, University of Pennsylvania, nmatni@seas.upenn.edu; Insup Lee, University of Pennsylvania, lee@cis.upenn.edu
Pseudocode | Yes | Algorithm 1: Algorithm for solving (3). procedure ESTIMATECONFIDENCESETPREDICTOR(Ztrain, Z′train, Zval)
Open Source Code | Yes | Our code is available at https://github.com/sangdon/PAC-confidence-set.
Open Datasets | Yes | Finally, we evaluate our approach on three benchmarks: ResNet (He et al., 2016) for ImageNet (Russakovsky et al., 2015), a model (Held et al., 2016) learned for a visual object tracking benchmark (Wu et al., 2013), and a probabilistic dynamics model (Chua et al., 2018) learned for the half-cheetah environment (Brockman et al., 2016) (Section 4).
Dataset Splits | Yes | We randomly split these sequences to form the training set for calibration, validation set for confidence set estimation, and test set for evaluation. For each sequence, a pair of two adjacent frames constitutes a single example. Our training dataset contains 20,882 labeled examples, each consisting of a pair of consecutive images and ground truth bounding boxes. The validation set for confidence set estimation and test set contain 22,761 and 22,761 labeled examples, respectively.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. It only discusses the neural networks and datasets used.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. It mentions various models and datasets but no software versions.
Experiment Setup | Yes | We use our algorithm to compute confidence sets for ResNet (He et al., 2016) on ImageNet (Russakovsky et al., 2015), for ϵ = 0.01, δ = 10^{-5}, and n = 20000 validation images.
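
To give a feel for what the quoted setup (ϵ = 0.01, n = 20000, with δ taken here to be 10^{-5}) implies, the following is a minimal sketch of one standard way to turn validation errors into a PAC-style guarantee: a binomial tail bound. It finds the largest number of validation errors k for which the probability of seeing at most k errors, if the true error rate were ϵ, is no more than δ. Whether this matches the paper's exact procedure is an assumption; the function names `binom_cdf` and `max_allowed_errors` are hypothetical and are not taken from the authors' repository.

```python
import math


def log_binom_pmf(i: int, n: int, p: float) -> float:
    """Log of P[Binomial(n, p) == i], for 0 < p < 1, via log-gamma to avoid overflow."""
    log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
    return log_choose + i * math.log(p) + (n - i) * math.log1p(-p)


def binom_cdf(k: int, n: int, p: float) -> float:
    """P[Binomial(n, p) <= k]; returns 0.0 when k is negative."""
    total = 0.0
    for i in range(0, min(k, n) + 1):
        total += math.exp(log_binom_pmf(i, n, p))
    return total


def max_allowed_errors(n: int, eps: float, delta: float) -> int:
    """Largest k with binom_cdf(k, n, eps) <= delta, or -1 if even k = 0 fails.

    Illustrative assumption: a predictor that makes at most k errors on n
    i.i.d. validation examples is accepted as having true error <= eps,
    with confidence at least 1 - delta.
    """
    cumulative = 0.0
    best = -1
    for k in range(n + 1):
        cumulative += math.exp(log_binom_pmf(k, n, eps))
        if cumulative > delta:
            break
        best = k
    return best


if __name__ == "__main__":
    # Values quoted in the Experiment Setup entry (delta read as 1e-5).
    k = max_allowed_errors(n=20000, eps=0.01, delta=1e-5)
    print("largest admissible validation error count:", k)
```

Under this bound the admissible error count falls below the nϵ = 200 errors one would expect at an error rate of exactly ϵ, because the small δ demands a safety margin. With too few validation examples no count qualifies and the sketch returns -1.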