reproducibilityindex.ai

Length Optimization in Conformal Prediction

Authors: Shayan Kiyani, George J. Pappas, Hamed Hassani

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our extensive empirical evaluations demonstrate the superior prediction set size performance of CPL compared to state-of-the-art methods across diverse real-world and synthetic datasets in classification, regression, and large language model-based multiple choice question answering.
Researcher Affiliation	Academia	Shayan Kiyani, George Pappas, Hamed Hassani; Department of Electrical and Systems Engineering, University of Pennsylvania; {shayank, pappasg, hassani}@seas.upenn.edu
Pseudocode	Yes	Algorithm 1 Conformal Prediction with Length-Optimization (CPL)
Open Source Code	Yes	An implementation of our algorithm can be accessed at the following link: https://github.com/shayankiyani98/CP.
Open Datasets	Yes	We use multiple-choice question answering datasets, including TruthfulQA [51], MMLU [52], OpenBookQA [53], PIQA [54], and BigBench [55].
Dataset Splits	Yes	We generate 150K training samples, 50K calibration data points, and 50K test data points.
Hardware Specification	No	The paper does not explicitly state the specific hardware used for running the experiments (e.g., CPU/GPU models, memory, or cloud instance types).
Software Dependencies	No	The paper mentions software like 'Python notebook', 'Llama 2', 'GPT-2', and 'ResNet50 model', but it does not provide specific version numbers for these or any other key software dependencies (e.g., PyTorch version, CUDA version).
Experiment Setup	Yes	We use a 2-hidden-layer NN with layers of 20 and 10 neurons for the inner maximization.
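
The "Dataset Splits" and "Experiment Setup" rows above can be illustrated with a minimal sketch. It is not the authors' code (see the repository linked in the table for that), and the two rows may refer to different experiments in the paper. Only the split sizes (150K / 50K / 50K) and the hidden-layer widths (20 and 10) come from the quoted text; the input and output dimensions, the ReLU activation, the weight initialisation, and all function names are assumptions made for illustration.

```python
import math
import random

# Split sizes quoted in the "Dataset Splits" row.
N_TRAIN, N_CAL, N_TEST = 150_000, 50_000, 50_000

# Hidden widths quoted in the "Experiment Setup" row. INPUT_DIM, OUTPUT_DIM,
# and the ReLU activation are assumptions made for this sketch only.
INPUT_DIM, HIDDEN_1, HIDDEN_2, OUTPUT_DIM = 1, 20, 10, 1


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one dense layer with small random weights."""
    bound = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def init_mlp(rng):
    """Build the layer list for an INPUT_DIM -> 20 -> 10 -> OUTPUT_DIM network."""
    sizes = [INPUT_DIM, HIDDEN_1, HIDDEN_2, OUTPUT_DIM]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(params, x):
    """Forward pass: ReLU after each hidden layer, linear output layer."""
    h = list(x)
    for i, (weights, biases) in enumerate(params):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < len(params) - 1:  # hidden layers only
            h = [max(0.0, v) for v in h]
    return h


def split_indices(rng):
    """Shuffle sample indices and cut them into train / calibration / test."""
    idx = list(range(N_TRAIN + N_CAL + N_TEST))
    rng.shuffle(idx)
    train = idx[:N_TRAIN]
    cal = idx[N_TRAIN:N_TRAIN + N_CAL]
    test = idx[N_TRAIN + N_CAL:]
    return train, cal, test
```

A call such as `forward(init_mlp(random.Random(0)), [0.5])` returns a one-element list. Training this network for the inner maximization is outside the scope of the sketch.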