PAC Confidence Sets for Deep Neural Networks via Calibrated Prediction
Authors: Sangdon Park, Osbert Bastani, Nikolai Matni, Insup Lee
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we evaluate our approach on three benchmarks: ResNet (He et al., 2016) for ImageNet (Russakovsky et al., 2015), a model (Held et al., 2016) learned for a visual object tracking benchmark (Wu et al., 2013), and a probabilistic dynamics model (Chua et al., 2018) learned for the half-cheetah environment (Brockman et al., 2016) (Section 4). |
| Researcher Affiliation | Academia | Sangdon Park University of Pennsylvania sangdonp@cis.upenn.edu Osbert Bastani University of Pennsylvania obastani@seas.upenn.edu Nikolai Matni University of Pennsylvania nmatni@seas.upenn.edu Insup Lee University of Pennsylvania lee@cis.upenn.edu |
| Pseudocode | Yes | Algorithm 1: Algorithm for solving (3). procedure ESTIMATECONFIDENCESETPREDICTOR(Z_train, Z'_train, Z_val) |
| Open Source Code | Yes | Our code is available at https://github.com/sangdon/PAC-confidence-set. |
| Open Datasets | Yes | Finally, we evaluate our approach on three benchmarks: ResNet (He et al., 2016) for ImageNet (Russakovsky et al., 2015), a model (Held et al., 2016) learned for a visual object tracking benchmark (Wu et al., 2013), and a probabilistic dynamics model (Chua et al., 2018) learned for the half-cheetah environment (Brockman et al., 2016) (Section 4). |
| Dataset Splits | Yes | We randomly split these sequences to form the training set for calibration, validation set for confidence set estimation, and test set for evaluation. For each sequence, a pair of two adjacent frames constitutes a single example. Our training dataset contains 20,882 labeled examples, each consisting of a pair of consecutive images and ground truth bounding boxes. The validation set for confidence set estimation and test set contain 22,761 and 22,761 labeled examples, respectively. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments. It only discusses the neural networks and datasets used. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. It mentions various models and datasets but no software versions. |
| Experiment Setup | Yes | We use our algorithm to compute confidence sets for ResNet (He et al., 2016) on ImageNet (Russakovsky et al., 2015), for ϵ = 0.01, δ = 10⁻⁵, and n = 20000 validation images. |
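
To make the roles of ϵ, δ, and n in the experiment setup concrete, the following is a minimal sketch of one standard way such parameters can be turned into an error budget on the validation set, using a binomial tail bound. It is an illustration under stated assumptions, not the authors' implementation: the function names `log_binom_pmf` and `max_allowed_errors` are hypothetical, and the code in the linked repository may compute its bound differently. The idea is that if the true error rate exceeded ϵ, then observing at most k errors among n independent validation examples would have probability at most P[Binomial(n, ϵ) ≤ k], so choosing the largest k that keeps this probability at or below δ gives a budget consistent with a (ϵ, δ) guarantee.

```python
import math


def log_binom_pmf(i, n, eps):
    """Log of P[Binomial(n, eps) = i], computed in log space to avoid underflow."""
    return (
        math.lgamma(n + 1)
        - math.lgamma(i + 1)
        - math.lgamma(n - i + 1)
        + i * math.log(eps)
        + (n - i) * math.log1p(-eps)
    )


def max_allowed_errors(n, eps, delta):
    """Largest k with P[Binomial(n, eps) <= k] <= delta, or None if no k qualifies.

    Hypothetical helper: k is the number of validation errors that a candidate
    confidence set predictor may make while still meeting the (eps, delta) bound.
    """
    cdf = 0.0
    best = None
    for k in range(n + 1):
        cdf += math.exp(log_binom_pmf(k, n, eps))
        if cdf > delta:
            break
        best = k
    return best


if __name__ == "__main__":
    # Parameters quoted in the experiment setup row above.
    k = max_allowed_errors(n=20000, eps=0.01, delta=1e-5)
    print("allowed validation errors:", k)
```

The returned budget is well below the expected count nϵ = 200, which reflects how small δ is: the stricter the confidence requirement, the fewer validation errors can be tolerated.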