PAC-Bayes Generalization Certificates for Learned Inductive Conformal Prediction
Authors: Apoorva Sharma, Sushant Veer, Asher Hancock, Heng Yang, Marco Pavone, Anirudha Majumdar
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the approach on regression and classification tasks, and outperform baselines calibrated using a Hoeffding bound-based PAC guarantee on ICP, especially in the low-data regime. (See the calibration sketch below the table.) |
| Researcher Affiliation | Collaboration | Apoorva Sharma (NVIDIA Research) apoorvas@nvidia.com; Sushant Veer (NVIDIA Research) sveer@nvidia.com; Asher Hancock (Princeton University) ah4775@princeton.edu; Heng Yang (Harvard University & NVIDIA Research) hengy@nvidia.com; Marco Pavone (Stanford University & NVIDIA Research) mpavone@nvidia.com; Anirudha Majumdar (Princeton University) ani.majumdar@princeton.edu |
| Pseudocode | Yes | The overall algorithm is summarized in Alg. 1. Algorithm 1 Optimal Conformal Prediction with Generalization Guarantees |
| Open Source Code | Yes | We implemented our approach using PyTorch [Paszke et al., 2019] and Hydra [Yadan, 2019]; code to run all experiments is available at https://github.com/NVlabs/pac-bayes-conformal-prediction |
| Open Datasets | Yes | As the base predictor, we use a LeNet convolutional neural network trained with a softmax objective to classify noise-free MNIST digits [LeCun et al., 1998]. |
| Dataset Splits | Yes | In the calibration phase, we first split the calibration data Dcal into two random splits, a tuning dataset D0 and a true calibration dataset DN, where the fraction of data used for tuning (the data split) is a hyperparameter. (See the split sketch below the table.) |
| Hardware Specification | Yes | All experiments were performed on a workstation with an Intel Core i9-10980XE CPU and an NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions using 'PyTorch [Paszke et al., 2019] and Hydra [Yadan, 2019]' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | For the learned model, we optimize the efficiency loss for 2000 steps with a learning rate of 1e-3 and a batch size of 100 samples. For the PAC-Bayes approach, we optimize using an augmented Lagrangian method, using 2000 steps of gradient descent with a learning rate of 1e-3 to solve the unconstrained penalized problem, running 7 outer iterations... where the temperature T is a hyperparameter which controls the smoothness of the approximation; in our experiments we use T = 0.1. (See the optimization sketch below the table.) |
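
The Hoeffding bound-based PAC baseline quoted in the Research Type row is not spelled out on this page. Below is a minimal Python sketch of the standard construction for split conformal prediction, assuming nonconformity scores on held-out calibration data; the function name `hoeffding_pac_threshold` and the exact slack term are our assumptions, not the paper's released code.

```python
import numpy as np

def hoeffding_pac_threshold(scores, epsilon, delta):
    """Calibrate an ICP threshold with a Hoeffding-style PAC correction.

    With probability >= 1 - delta over the N calibration scores, the
    returned threshold gives test miscoverage at most epsilon. Sketch
    only; the paper's baseline may use different constants.
    """
    n = len(scores)
    # Hoeffding bound: inflate the empirical quantile level by
    # sqrt(log(1/delta) / (2n)) to absorb finite-sample error.
    slack = np.sqrt(np.log(1.0 / delta) / (2.0 * n))
    level = 1.0 - epsilon + slack
    if level >= 1.0:
        # Too little calibration data for this (epsilon, delta) pair.
        return np.inf
    return np.quantile(scores, level, method="higher")

# Usage: scores are nonconformity scores on the calibration split.
rng = np.random.default_rng(0)
tau = hoeffding_pac_threshold(rng.normal(size=500), epsilon=0.1, delta=0.05)
```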
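The Dataset Splits row describes dividing Dcal into a tuning set D0 and a true calibration set DN, with the tuning fraction as a hyperparameter. A minimal sketch of such a random split (all names here are ours):

```python
import numpy as np

def split_calibration(data, tune_frac, seed=0):
    """Randomly split D_cal into a tuning set D_0 and a calibration
    set D_N; tune_frac is the 'data split' hyperparameter. Sketch only."""
    idx = np.random.default_rng(seed).permutation(len(data))
    n_tune = int(tune_frac * len(data))
    return data[idx[:n_tune]], data[idx[n_tune:]]
```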
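The Experiment Setup row references a temperature-smoothed approximation (T = 0.1) and an augmented Lagrangian method running 7 outer iterations of 2000 gradient steps at a 1e-3 learning rate. The sketch below mirrors that schedule using standard augmented-Lagrangian updates for an inequality constraint c(θ) ≤ 0; the sigmoid surrogate, the penalty growth factor `rho_growth`, and all function names are our assumptions, not the paper's implementation.

```python
import torch

def soft_coverage(scores: torch.Tensor, tau: torch.Tensor, T: float = 0.1) -> torch.Tensor:
    """Differentiable surrogate for the coverage indicator 1[score <= tau].

    A sigmoid with temperature T; it sharpens toward the hard indicator
    as T -> 0. T = 0.1 matches the setting reported in the paper.
    """
    return torch.sigmoid((tau - scores) / T)

def augmented_lagrangian_calibrate(loss_fn, constraint_fn, params,
                                   outer_iters=7, inner_steps=2000,
                                   lr=1e-3, rho=1.0, rho_growth=10.0):
    """Solve min loss_fn(params) subject to constraint_fn(params) <= 0.

    Schedule follows the paper: 7 outer iterations, each running 2000
    gradient steps at lr = 1e-3 on the penalized problem. rho and
    rho_growth are our assumptions.
    """
    lam = torch.zeros(())  # Lagrange multiplier estimate
    for _ in range(outer_iters):
        opt = torch.optim.SGD(params, lr=lr)
        for _ in range(inner_steps):
            opt.zero_grad()
            c = constraint_fn(params)
            # Standard augmented-Lagrangian penalty for c <= 0.
            penalty = (torch.clamp(lam + rho * c, min=0.0) ** 2 - lam ** 2) / (2 * rho)
            (loss_fn(params) + penalty).backward()
            opt.step()
        with torch.no_grad():
            lam = torch.clamp(lam + rho * constraint_fn(params), min=0.0)
        rho *= rho_growth  # tighten the penalty between outer iterations
    return params
```

Here `loss_fn` would be the smoothed efficiency loss and `constraint_fn` the PAC-Bayes coverage constraint from the paper; both are left abstract in this sketch.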