Conformal Prediction with Learned Features

Authors: Shayan Kiyani, George J. Pappas, Hamed Hassani

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, our experimental results over four real-world and synthetic datasets show the superior performance of PLCP compared to state-of-the-art methods in terms of coverage and length in both classification and regression scenarios."
Researcher Affiliation | Academia | "The Electrical and Systems Engineering Department, University of Pennsylvania, USA. Correspondence to: Shayan Kiyani <shayank@seas.upenn.edu>, George Pappas <pappasg@seas.upenn.edu>, Hamed Hassani <hassani@seas.upenn.edu>."
Pseudocode | Yes | "Algorithm 1 Partition Learned Conformal Prediction (PLCP)" (a hedged sketch of the algorithm's alternating structure appears after this table)
Open Source Code | No | No statement about open-source code release or repository links for the described methodology was found.
Open Datasets | Yes | "We study the 2018 US Census Data from the Folktables library (Ding et al., 2021) for income prediction... We divide the MNIST dataset into 35,000 training images and 25,000 for calibration/testing. ... Our last experiment is on the RxRx1 dataset (Taylor et al., 2019) from the WILDS repository (Koh et al., 2021)..." (a loading sketch for the Folktables data appears after this table)
Dataset Splits | Yes | "Data are divided into three segments: 60% for training, 20% for calibration, and 20% for testing. ... These 25,000 blurred images are then randomly divided into a 15,000-image calibration set and a 10,000-image test set. ... We generate from this distribution 150K training samples (to train the regression model to predict the label), 50K calibration data points, and 50K test data points." (a split sketch appears after this table)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments were provided in the paper.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) were explicitly mentioned for reproducibility.
Experiment Setup | Yes | "In all experiments, we set the miscoverage rate, α, to 0.1. ... PLCP is implemented using a two-layer ReLU neural network (200 and 100 neurons). ... PLCP is implemented with m = 8 to denote eight distinct groups, employing a Convolutional Neural Network (CNN) architecture with three convolution layers and two feed-forward layers. ... We ran PLCP with m = 25 (25 groups), using a linear classifier (as H). ... For this experiment, we ran PLCP using a CNN with a single convolution layer, with ReLU activation followed by a linear layer, configured with m = 20 groups. ... For genetic treatment prediction (the predictive model), we employ a ResNet50 architecture, f(x), pre-trained on 37 experiments from the WILDS repository. ... To identify the optimal m, we employ the doubling trick: setting aside 20 percent of the calibration data for validation, we increment m from a small value, evaluate PLCP on the validation set, and continue doubling m until the validation metric worsens. We then fine-tune by bisecting between the last two m values." (a sketch of this m-selection procedure appears after this table)
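
The paper's Algorithm 1 is not reproduced in this report, but its alternating structure can be sketched: compute per-group score quantiles, then re-learn the group assignment as a function of x. The sketch below is a minimal stand-in under stated assumptions, not the authors' implementation: it assumes conformity scores s_cal are already computed, replaces the neural-network hypothesis class H with scikit-learn's LogisticRegression, and uses hard pinball-loss-minimizing group targets in place of gradient steps.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pinball(s, q, alpha):
    """Pinball (quantile) loss of threshold q against score s at level 1 - alpha."""
    diff = s - q
    return np.maximum((1 - alpha) * diff, -alpha * diff)

def plcp_sketch(x_cal, s_cal, m=8, alpha=0.1, n_iters=10, seed=0):
    """Alternating-minimization sketch in the spirit of Algorithm 1 (PLCP).

    Hypothetical simplification: H is a multinomial logistic regression,
    and the h-step is a classification fit rather than a gradient step.
    """
    rng = np.random.default_rng(seed)
    groups = rng.integers(m, size=len(s_cal))   # random initial partition
    clf = None
    for _ in range(n_iters):
        # q-step: per-group (1 - alpha) empirical quantile of the scores,
        # falling back to the marginal quantile for empty groups.
        q = np.array([
            np.quantile(s_cal[groups == j], 1 - alpha)
            if np.any(groups == j) else np.quantile(s_cal, 1 - alpha)
            for j in range(m)
        ])
        # h-step: each point's pinball-loss-minimizing group, learned as a
        # function of x so the partition generalizes to test points.
        targets = pinball(s_cal[:, None], q[None, :], alpha).argmin(axis=1)
        if len(np.unique(targets)) < 2:         # partition collapsed; stop early
            break
        clf = LogisticRegression(max_iter=500).fit(x_cal, targets)
        groups = clf.predict(x_cal)
    return clf, q

# At test time, a point x would receive the threshold q[clf.predict(x)]
# and the prediction set {y : score(x, y) <= q[clf.predict(x)]}.
```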
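
The Folktables income task quoted in the Open Datasets row can be loaded through the library's public API. The state selection below is an assumption made for illustration, since the excerpt does not say which states the authors used.

```python
from folktables import ACSDataSource, ACSIncome

# 2018 1-year ACS person records, matching the "2018 US Census Data" quote.
# states=["CA"] is an illustrative assumption, not from the paper.
source = ACSDataSource(survey_year="2018", horizon="1-Year", survey="person")
acs_data = source.get_data(states=["CA"], download=True)
features, labels, _ = ACSIncome.df_to_numpy(acs_data)
```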
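
A minimal sketch of the 60/20/20 train/calibration/test split described in the Dataset Splits row, continuing from the features array of the previous snippet. Plain NumPy index shuffling is an assumption, as the paper does not name its tooling.

```python
import numpy as np

rng = np.random.default_rng(0)
idx = rng.permutation(len(features))
n_train, n_cal = int(0.6 * len(idx)), int(0.2 * len(idx))
train_idx = idx[:n_train]                    # 60% for training
cal_idx = idx[n_train:n_train + n_cal]       # 20% for calibration
test_idx = idx[n_train + n_cal:]             # 20% for testing
```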
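
The doubling trick quoted in the Experiment Setup row translates directly into a model-selection loop. In the sketch below, fit_plcp and val_metric are hypothetical callables standing in for PLCP training and the paper's unspecified validation metric (lower is assumed better).

```python
import numpy as np

def select_m(x_cal, s_cal, fit_plcp, val_metric, m0=2, seed=0):
    """Doubling-trick sketch for choosing the number of groups m."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(s_cal))
    n_val = len(idx) // 5                    # 20% of calibration for validation
    val, fit = idx[:n_val], idx[n_val:]

    def score(m):
        model = fit_plcp(x_cal[fit], s_cal[fit], m)
        return val_metric(model, x_cal[val], s_cal[val])

    m, best = m0, score(m0)
    while True:                              # double m until validation worsens
        cand = score(2 * m)
        if cand >= best:
            break
        m, best = 2 * m, cand
    lo, hi = m, 2 * m                        # fine-tune by bisecting between
    while hi - lo > 1:                       # the last two m values
        mid = (lo + hi) // 2
        s_mid = score(mid)
        if s_mid < best:
            lo, best = mid, s_mid
        else:
            hi = mid
    return lo
```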