PAC Prediction Sets Under Label Shift
Authors: Wenwen Si, Sangdon Park, Insup Lee, Edgar Dobriban, Osbert Bastani
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate our approach on four datasets: the CIFAR-10 and Chest X-Ray image datasets, the tabular CDC Heart Dataset, and the AGNews text dataset. Our algorithm satisfies the PAC guarantee while producing smaller prediction set sizes compared to several baselines. |
| Researcher Affiliation | Academia | (1) Department of Computer & Information Science, University of Pennsylvania; (2) Graduate School of AI and Computer Science & Engineering, POSTECH; (3) Department of Statistics and Data Science, University of Pennsylvania |
| Pseudocode | Yes | Algorithm 1 PS-W: PAC prediction sets in the label shift setting (a hedged sketch follows the table). |
| Open Source Code | Yes | Reproducibility statement. Our code is available at https://github.com/averysi224/pac-ps-label-shift for reproducing our experiments. |
| Open Datasets | Yes | We empirically evaluate our approach on four datasets: the CIFAR-10 (Krizhevsky et al., 2009) and Chest X-Ray (National Institutes of Health and others, 2022) image datasets, the tabular CDC Heart Dataset (Centers for Disease Control and Prevention (CDC), 1984), and the AGNews text dataset (Zhang et al., 2015). |
| Dataset Splits | Yes | First, we split the full dataset into training data and base source and target datasets. We use the training dataset to fit a score function. Given label distributions P_Y and Q_Y, we generate the source dataset S_m, the unlabeled target dataset T_n^X, and a labeled target test dataset of size o (sampled from Q) by sampling with replacement from the corresponding base datasets (a sketch of this resampling step follows the table). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided for running experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library versions, programming language versions) are mentioned. |
| Experiment Setup | Yes | We use a two-layer MLP for the CDC Heart data, trained with SGD (learning rate 0.03, momentum 0.9) at batch size 64 for 30 epochs. For CIFAR-10, we finetune a pretrained ResNet50 (He et al., 2016) with a learning rate of 0.01 for 56 epochs. (A training-loop sketch follows the table.) |
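
The Pseudocode row references Algorithm 1 (PS-W). The paper's algorithm additionally accounts for uncertainty in the estimated importance weights; the sketch below is a simplified stand-in, not the released implementation. It estimates label-shift weights w(y) = q(y)/p(y) via black-box shift estimation (confusion-matrix inversion), resamples source calibration points in proportion to those weights, and calibrates a PAC threshold with a Clopper-Pearson binomial bound. All function and variable names are illustrative.

```python
import numpy as np
from scipy.stats import beta

def estimate_label_shift_weights(probs_src, y_src, probs_tgt):
    """Black-box shift estimation (BBSE): solve C w = mu_hat, where C is the
    source joint distribution of (predicted, true) labels and mu_hat is the
    predicted-label distribution on the unlabeled target covariates."""
    k = probs_src.shape[1]
    pred_src = probs_src.argmax(axis=1)
    C = np.zeros((k, k))
    for p, y in zip(pred_src, y_src):
        C[p, y] += 1.0 / len(y_src)
    mu_hat = np.bincount(probs_tgt.argmax(axis=1), minlength=k) / len(probs_tgt)
    w = np.linalg.solve(C, mu_hat)
    return np.clip(w, 0.0, None)  # estimated q(y)/p(y), clipped to nonnegative

def pac_threshold(scores, eps, delta):
    """Largest threshold tau whose Clopper-Pearson upper confidence bound
    (level 1 - delta) on the miscoverage rate is at most eps."""
    s = np.sort(scores)
    n = len(s)
    tau = -np.inf  # fall back to the full label set if no k qualifies
    for k in range(n):  # k = number of calibration errors tolerated
        ucb = beta.ppf(1.0 - delta, k + 1, n - k)  # CP upper bound on error
        if ucb <= eps:
            tau = s[k]
    return tau

def ps_w_simplified(probs_src, y_src, probs_tgt, eps, delta, rng):
    """Simplified label-shift PAC set: resample source calibration points
    in proportion to their estimated weights, then calibrate as usual."""
    w = estimate_label_shift_weights(probs_src, y_src, probs_tgt)
    p = w[y_src] / w[y_src].sum()
    idx = rng.choice(len(y_src), size=len(y_src), replace=True, p=p)
    scores = probs_src[idx, y_src[idx]]  # score = predicted prob of true label
    return pac_threshold(scores, eps, delta)
```

Prediction sets are then C(x) = {y : f(x)[y] >= tau}; under the estimated shift, the target miscoverage is at most eps with high probability, which is the kind of guarantee the paper's PS-W establishes rigorously.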
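
The Dataset Splits row describes generating source and target datasets by sampling with replacement from base datasets according to chosen label distributions P_Y and Q_Y. A minimal sketch of that resampling step; `sample_with_label_dist` and the example distributions are illustrative, not taken from the released code:

```python
import numpy as np

def sample_with_label_dist(xs, ys, label_dist, size, rng):
    """Draw `size` examples with replacement so that labels follow
    `label_dist` (a length-K probability vector), mirroring the paper's
    construction of source/target datasets from base datasets."""
    labels = rng.choice(len(label_dist), size=size, p=label_dist)
    by_class = {c: np.flatnonzero(ys == c) for c in range(len(label_dist))}
    idx = np.array([rng.choice(by_class[c]) for c in labels])
    return xs[idx], ys[idx]

rng = np.random.default_rng(0)
# e.g., a uniform source P_Y and a skewed target Q_Y over 10 classes
p_y = np.full(10, 0.1)
q_y = np.array([0.3, 0.3, 0.1, 0.1, 0.05, 0.05, 0.025, 0.025, 0.025, 0.025])
# x_src, y_src = sample_with_label_dist(x_base_src, y_base_src, p_y, m, rng)
# x_tgt, y_tgt = sample_with_label_dist(x_base_tgt, y_base_tgt, q_y, n, rng)
```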
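
The Experiment Setup row reports the CDC Heart configuration: a two-layer MLP trained with SGD (learning rate 0.03, momentum 0.9, batch size 64, 30 epochs). A PyTorch sketch under those hyperparameters; the feature dimension, hidden width, and synthetic data are assumptions, not reported values:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic placeholder data; the real experiment uses CDC Heart features.
x_train = torch.randn(1024, 17)         # feature dim 17: assumption
y_train = torch.randint(0, 2, (1024,))  # binary outcome: assumption

loader = DataLoader(TensorDataset(x_train, y_train), batch_size=64, shuffle=True)

# Two-layer MLP; the hidden width (64) is an illustrative guess.
model = nn.Sequential(nn.Linear(17, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.03, momentum=0.9)  # reported values
loss_fn = nn.CrossEntropyLoss()

for epoch in range(30):  # 30 epochs, as reported
    for xb, yb in loader:
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
```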