reproducibilityindex.ai

Conformal Prediction Sets Improve Human Decision Making

Authors: Jesse C. Cresswell, Yi Sui, Bhargava Kumar, Noël Vouitsis

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this work, we study the usefulness of conformal prediction sets as an aid for human decision making by conducting a pre-registered randomized controlled trial with conformal prediction sets provided to human subjects. With statistical significance, we find that when humans are given conformal prediction sets their accuracy on tasks improves compared to fixed-size prediction sets with the same coverage guarantee.
Researcher Affiliation	Industry	Jesse C. Cresswell¹, Yi Sui¹, Bhargava Kumar², Noël Vouitsis¹. ¹Layer 6 AI, ²TD Securities.
Pseudocode	No	The paper describes the RAPS procedure and experimental methods in text but does not include any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Full details on dataset construction are given in Appendix A, and our code is available at this GitHub repository for reproducibility.
Open Datasets	Yes	As a representative image classification task we used ObjectNet (Barbu et al., 2019)... As a prototypical task, we used GoEmotions (Demszky et al., 2020)... We used the Few-NERD dataset of sentences from Wikipedia in English (Ding et al., 2021)...
Dataset Splits	Yes	split the dataset into D_cal and D_test maintaining class balance.
Hardware Specification	Yes	All pre-processing including calibration and generation of conformal sets was carried out with an Intel Xeon Silver 4114 CPU and TITAN V GPU and takes under 3 hours.
Software Dependencies	No	The paper mentions 'PsychoPy (Peirce et al., 2019)' and 'Pavlovia (Pavlovia, 2024)' as tools used, but does not provide specific version numbers for these or any other software dependencies like programming languages or libraries.
Experiment Setup	Yes	We summarize the hyperparameters of our experiments in Table 6. Also shown is the empirical risk α̂ for top-k sets, which was then used as α for conformal calibration.
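
The Dataset Splits and Experiment Setup rows describe a class-balanced split into D_cal and D_test, and an empirical top-k risk α̂ that is reused as α for conformal calibration. The sketch below illustrates that flow on synthetic data. It is not the authors' code: the paper uses RAPS, whereas this sketch swaps in the simpler score 1 − p(true class). Computing α̂ on the calibration split, the 50/50 split fraction, k = 3, the synthetic probabilities, and all function names are assumptions made for illustration.

```python
import numpy as np

def stratified_split(labels, cal_fraction, rng):
    """Split indices into calibration/test parts, keeping class balance."""
    cal_idx, test_idx = [], []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_cal = int(round(cal_fraction * len(idx)))
        cal_idx.extend(idx[:n_cal])
        test_idx.extend(idx[n_cal:])
    return np.array(cal_idx), np.array(test_idx)

def topk_empirical_risk(probs, labels, k):
    """Fraction of examples whose true label is outside the top-k classes."""
    topk = np.argsort(-probs, axis=1)[:, :k]
    covered = (topk == labels[:, None]).any(axis=1)
    return 1.0 - covered.mean()

def conformal_threshold(probs, labels, alpha):
    """Split-conformal threshold for the score s = 1 - p(true class).

    Assumption: this simple score stands in for RAPS, which the paper uses.
    """
    n = len(labels)
    scores = 1.0 - probs[np.arange(n), labels]
    # Rank of the finite-sample-corrected (1 - alpha) quantile.
    m = max(int(np.ceil((n + 1) * (1.0 - alpha))), 1)
    if m > n:
        return np.inf  # too few calibration points: include every class
    return np.sort(scores)[m - 1]

def prediction_sets(probs, qhat):
    """Boolean mask: class j is in the set when its score is at most qhat."""
    return (1.0 - probs) <= qhat

# Synthetic stand-in for model outputs (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
n_classes, per_class = 10, 200
labels = np.repeat(np.arange(n_classes), per_class)
logits = rng.normal(size=(labels.size, n_classes))
logits[np.arange(labels.size), labels] += 2.0  # make the true class likelier
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

cal_idx, test_idx = stratified_split(labels, cal_fraction=0.5, rng=rng)

# Empirical risk of top-k sets, then reused as alpha for calibration.
alpha_hat = topk_empirical_risk(probs[cal_idx], labels[cal_idx], k=3)
qhat = conformal_threshold(probs[cal_idx], labels[cal_idx], alpha_hat)

sets_test = prediction_sets(probs[test_idx], qhat)
coverage = sets_test[np.arange(len(test_idx)), labels[test_idx]].mean()
print(f"alpha_hat={alpha_hat:.3f}  test coverage={coverage:.3f}  "
      f"mean set size={sets_test.sum(axis=1).mean():.2f}")
```

Matching α to the top-k risk puts both kinds of set at the same target coverage level, so they can be compared mainly on how set size adapts per example, which is the comparison the Research Type row describes.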