PAC Prediction Sets Under Label Shift

Authors: Wenwen Si, Sangdon Park, Insup Lee, Edgar Dobriban, Osbert Bastani

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically evaluate our approach on four datasets: the CIFAR-10 and Chest X-Ray image datasets, the tabular CDC Heart Dataset, and the AGNews text dataset. Our algorithm satisfies the PAC guarantee while producing smaller prediction set sizes compared to several baselines."
Researcher Affiliation | Academia | "¹Department of Computer & Information Science, University of Pennsylvania; ²Graduate School of AI and Computer Science & Engineering, POSTECH; ³Department of Statistics and Data Science, University of Pennsylvania"
Pseudocode | Yes | "Algorithm 1 PS-W: PAC prediction sets in the label shift setting." (A simplified calibration sketch in the spirit of this algorithm appears after the table.)
Open Source Code | Yes | "Reproducibility statement. Our code is available at https://github.com/averysi224/pac-ps-label-shift for reproducing our experiments."
Open Datasets | Yes | "We empirically evaluate our approach on four datasets: the CIFAR-10 (Krizhevsky et al., 2009) and Chest X-Ray (National Institutes of Health and others, 2022) image datasets, the tabular CDC Heart Dataset (Centers for Disease Control and Prevention (CDC), 1984), and the AGNews text dataset (Zhang et al., 2015)."
Dataset Splits | Yes | "First, we split the full dataset into training data and base source and target datasets. We use the training dataset to fit a score function. Given label distributions P_Y and Q_Y, we generate the source dataset S_m, the unlabeled target dataset T_n^X, and a labeled target test dataset of size o (sampled from Q) by sampling with replacement from the corresponding base datasets." (A resampling sketch appears after the table.)
Hardware Specification | No | "No specific hardware details (e.g., GPU/CPU models, memory) are provided for running experiments."
Software Dependencies | No | "No specific software dependencies with version numbers (e.g., library versions, programming language versions) are mentioned."
Experiment Setup | Yes | "We use a two-layer MLP for the CDC Heart data with an SGD optimizer having a learning rate of 0.03 and a momentum of 0.9, using a batch size of 64 for 30 epochs. For CIFAR-10, we fine-tune a pretrained ResNet-50 (He et al., 2016) with a learning rate of 0.01 for 56 epochs." (A training-loop sketch appears after the table.)
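Regarding the Pseudocode row: the paper's Algorithm 1 (PS-W) additionally accounts for uncertainty in the estimated importance weights, so the following is only a rough sketch of the underlying idea, not the authors' exact procedure. It assumes the true per-class weights w(y) = Q_Y(y)/P_Y(y) are known, rejection-samples the source calibration data toward the target label distribution, and then selects the threshold with a standard Clopper-Pearson binomial bound; all function names are ours.

```python
import numpy as np
from scipy.stats import beta

def clopper_pearson_ucb(k, n, delta):
    """Upper 1 - delta confidence bound on a binomial proportion,
    given k failures out of n trials (Clopper-Pearson)."""
    if n == 0 or k >= n:
        return 1.0
    return beta.ppf(1 - delta, k + 1, n - k)

def calibrate_threshold(scores, labels, w, eps, delta, rng):
    """Largest threshold tau such that the target miscoverage
    Q(score of true label < tau) is at most eps with confidence
    1 - delta. `scores[i]` is the score of the true label of
    calibration example i; `w[y] = Q_Y(y) / P_Y(y)` are per-class
    importance weights (assumed known in this sketch)."""
    # Rejection sampling: keep example i with probability
    # w[y_i] / max(w), so accepted examples are distributed as Q.
    keep = rng.random(len(labels)) < w[labels] / w.max()
    s = np.sort(scores[keep])
    n = len(s)
    # With distinct scores, tau = s[k] incurs exactly k errors (the
    # k smallest true-label scores fall below it); the bound is
    # increasing in k, so take the largest k that still passes.
    k_star = -1
    for k in range(n):
        if clopper_pearson_ucb(k, n, delta) > eps:
            break
        k_star = k
    return s[k_star] if k_star >= 0 else 0.0

def prediction_set(label_scores, tau):
    """Prediction set at test time: all labels scoring at least tau."""
    return np.flatnonzero(label_scores >= tau)
```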
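For the Dataset Splits row, a minimal sketch of sampling with replacement under a prescribed label marginal; the array names `X`, `y`, and `label_dist` are placeholders, not names from the paper's code:

```python
import numpy as np

def sample_with_label_dist(X, y, label_dist, size, rng):
    """Draw `size` examples with replacement from a base dataset so
    that labels follow `label_dist` (a probability vector over
    classes), mirroring how the source and target datasets are
    generated from the base datasets."""
    drawn = rng.choice(len(label_dist), size=size, p=label_dist)
    # For each drawn label, pick a uniform base example of that label.
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=count, replace=True)
        for c, count in zip(*np.unique(drawn, return_counts=True))
    ])
    rng.shuffle(idx)
    return X[idx], y[idx]

# e.g., source S_m from P_Y, unlabeled target T_n^X (labels discarded),
# and a labeled target test set of size o from Q_Y:
#   Xs, ys = sample_with_label_dist(X_base_src, y_base_src, P_Y, m, rng)
#   Xt, _  = sample_with_label_dist(X_base_tgt, y_base_tgt, Q_Y, n, rng)
```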
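And for the Experiment Setup row, a PyTorch sketch of the reported CDC Heart configuration. Only the optimizer settings (SGD, lr 0.03, momentum 0.9), batch size 64, and 30 epochs come from the excerpt; the hidden width, input/output dimensions, and stand-in data are our assumptions.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

num_features, num_classes = 21, 2       # placeholders; not stated in the excerpt
X = torch.randn(1024, num_features)     # stand-in data for illustration
y = torch.randint(0, num_classes, (1024,))
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

# Two-layer MLP; the hidden width (128) is an assumption.
model = nn.Sequential(
    nn.Linear(num_features, 128),
    nn.ReLU(),
    nn.Linear(128, num_classes),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.03, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for epoch in range(30):                 # 30 epochs, as reported
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
```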