Conformal Prediction Sets Improve Human Decision Making

Authors: Jesse C. Cresswell, Yi Sui, Bhargava Kumar, Noël Vouitsis

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we study the usefulness of conformal prediction sets as an aid for human decision making by conducting a pre-registered randomized controlled trial with conformal prediction sets provided to human subjects. With statistical significance, we find that when humans are given conformal prediction sets their accuracy on tasks improves compared to fixed-size prediction sets with the same coverage guarantee.
Researcher Affiliation | Industry | Jesse C. Cresswell (1), Yi Sui (1), Bhargava Kumar (2), Noël Vouitsis (1). (1) Layer 6 AI; (2) TD Securities.
Pseudocode | No | The paper describes the RAPS procedure and experimental methods in text but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Full details on dataset construction are given in Appendix A, and our code is available at this GitHub repository for reproducibility.
Open Datasets | Yes | As a representative image classification task we used ObjectNet (Barbu et al., 2019)... As a prototypical task, we used GoEmotions (Demszky et al., 2020)... We used the Few-NERD dataset of sentences from Wikipedia in English (Ding et al., 2021)...
Dataset Splits | Yes | split the dataset into D_cal and D_test maintaining class balance.
Hardware Specification | Yes | All pre-processing including calibration and generation of conformal sets was carried out with an Intel Xeon Silver 4114 CPU and TITAN V GPU and takes under 3 hours.
Software Dependencies | No | The paper mentions 'PsychoPy (Peirce et al., 2019)' and 'Pavlovia (Pavlovia, 2024)' as tools used, but does not provide specific version numbers for these or any other software dependencies like programming languages or libraries.
Experiment Setup | Yes | We summarize the hyperparameters of our experiments in Table 6. Also shown is the empirical risk α̂ for top-k sets, which was then used as α for conformal calibration.
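
The Experiment Setup entry describes a two-step recipe: measure the empirical risk α̂ of top-k sets, then use that value as α for conformal calibration. The sketch below illustrates the idea in pure Python under stated assumptions. It is not the paper's code: it uses a simple conformal score (one minus the softmax probability of the true class) in place of the RAPS procedure the paper describes, and the function names topk_empirical_risk, conformal_threshold, and prediction_set are hypothetical.

```python
import math


def topk_empirical_risk(probs, labels, k):
    """Fraction of examples whose true label is NOT among the k most probable classes."""
    misses = 0
    for p, y in zip(probs, labels):
        topk = sorted(range(len(p)), key=lambda c: p[c], reverse=True)[:k]
        if y not in topk:
            misses += 1
    return misses / len(labels)


def conformal_threshold(probs, labels, alpha):
    """Split-conformal threshold on the score 1 - p(true class).

    Assumption: this simple score stands in for RAPS, which the paper uses.
    """
    scores = sorted(1.0 - p[y] for p, y in zip(probs, labels))
    n = len(scores)
    # Finite-sample corrected rank (1-indexed) of the calibration quantile.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: include every class.
        return float("inf")
    return scores[rank - 1]


def prediction_set(p, threshold):
    """Classes whose score 1 - p(class) does not exceed the threshold."""
    return [c for c, pc in enumerate(p) if 1.0 - pc <= threshold]
```

With calibration probabilities and labels in hand, one would compute alpha_hat = topk_empirical_risk(cal_probs, cal_labels, k), pass it to conformal_threshold, and then call prediction_set on each test example. Matching α to the top-k risk in this way is what lets the two kinds of sets be compared at the same coverage level.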