Length Optimization in Conformal Prediction

Authors: Shayan Kiyani, George J. Pappas, Hamed Hassani

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive empirical evaluations demonstrate the superior prediction set size performance of CPL compared to state-of-the-art methods across diverse real-world and synthetic datasets in classification, regression, and large language model-based multiple choice question answering. |
| Researcher Affiliation | Academia | Shayan Kiyani, George Pappas, Hamed Hassani; Department of Electrical and Systems Engineering, University of Pennsylvania; {shayank, pappasg, hassani}@seas.upenn.edu |
| Pseudocode | Yes | Algorithm 1: Conformal Prediction with Length-Optimization (CPL) |
| Open Source Code | Yes | An implementation of our algorithm can be accessed at the following link: https://github.com/shayankiyani98/CP. |
| Open Datasets | Yes | We use multiple-choice question answering datasets, including TruthfulQA [51], MMLU [52], OpenBookQA [53], PIQA [54], and BigBench [55]. |
| Dataset Splits | Yes | We generate 150K training samples, 50K calibration data points, and 50K test data points. |
| Hardware Specification | No | The paper does not explicitly state the specific hardware used for running the experiments (e.g., CPU/GPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper mentions software such as a Python notebook, Llama 2, GPT-2, and a ResNet50 model, but it does not provide specific version numbers for these or any other key software dependencies (e.g., PyTorch version, CUDA version). |
| Experiment Setup | Yes | We use a 2-hidden-layer NN with layers of 20 and 10 neurons for the inner maximization. |
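The experiment-setup row reports a 2-hidden-layer network with 20 and 10 neurons for the inner maximization. A minimal NumPy sketch of such a network is shown below; the input and output dimensions, ReLU activations, and initialization scheme are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def make_mlp(in_dim, hidden=(20, 10), out_dim=1, seed=0):
    """Build weights for a 2-hidden-layer MLP with 20 and 10 neurons,
    matching the layer sizes reported in the paper. in_dim and out_dim
    are assumed values for illustration."""
    rng = np.random.default_rng(seed)
    dims = (in_dim, *hidden, out_dim)
    # He-style initialization for each (weight, bias) pair (an assumption).
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def forward(params, x):
    """Forward pass: ReLU on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    return x

# A batch of 3 inputs with an assumed feature dimension of 5.
params = make_mlp(in_dim=5)
out = forward(params, np.zeros((3, 5)))
print(out.shape)  # (3, 1)
```

In the paper's algorithm this network would be trained as the adversary in the inner maximization loop; the sketch only reproduces the stated architecture, not the training objective.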