Conformal Prediction Sets Improve Human Decision Making
Authors: Jesse C. Cresswell, Yi Sui, Bhargava Kumar, Noël Vouitsis
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we study the usefulness of conformal prediction sets as an aid for human decision making by conducting a pre-registered randomized controlled trial with conformal prediction sets provided to human subjects. With statistical significance, we find that when humans are given conformal prediction sets their accuracy on tasks improves compared to fixed-size prediction sets with the same coverage guarantee. |
| Researcher Affiliation | Industry | Jesse C. Cresswell¹, Yi Sui¹, Bhargava Kumar², Noël Vouitsis¹; ¹Layer 6 AI, ²TD Securities. |
| Pseudocode | No | The paper describes the RAPS procedure and experimental methods in text but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Full details on dataset construction are given in Appendix A, and our code is available at this GitHub repository for reproducibility. |
| Open Datasets | Yes | As a representative image classification task we used ObjectNet (Barbu et al., 2019)... As a prototypical task, we used GoEmotions (Demszky et al., 2020)... We used the Few-NERD dataset of sentences from Wikipedia in English (Ding et al., 2021)... |
| Dataset Splits | Yes | ...split the dataset into D_cal and D_test maintaining class balance. |
| Hardware Specification | Yes | All pre-processing including calibration and generation of conformal sets was carried out with an Intel Xeon Silver 4114 CPU and TITAN V GPU and takes under 3 hours. |
| Software Dependencies | No | The paper mentions 'PsychoPy (Peirce et al., 2019)' and 'Pavlovia (Pavlovia, 2024)' as tools used, but does not provide specific version numbers for these or any other software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | We summarize the hyperparameters of our experiments in Table 6. Also shown is the empirical risk α̂ for top-k sets, which was then used as α for conformal calibration. |
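
The Experiment Setup row describes measuring the empirical risk α̂ of top-k sets and reusing it as α for conformal calibration, so that both kinds of set target the same coverage. The following is a minimal sketch of that idea, not the authors' code. It assumes a plain split-conformal score (one minus the probability of the true class) in place of the RAPS procedure the paper uses, and every function name and the toy data are hypothetical.

```python
import math


def topk_empirical_risk(probs, labels, k):
    """Fraction of examples whose true label is not among the k most probable classes."""
    misses = 0
    for p, y in zip(probs, labels):
        topk = sorted(range(len(p)), key=lambda c: p[c], reverse=True)[:k]
        if y not in topk:
            misses += 1
    return misses / len(labels)


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold for the score s = 1 - p[true class].

    Assumption: this simple score stands in for RAPS, which the paper uses.
    """
    scores = sorted(1.0 - p[y] for p, y in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed order statistic
    if rank > n:
        return float("inf")  # too few calibration points: keep every class
    return scores[rank - 1]


def prediction_set(p, qhat):
    """All classes whose score 1 - p[c] does not exceed the threshold."""
    return [c for c in range(len(p)) if 1.0 - p[c] <= qhat]


# Toy calibration data (hypothetical): four examples, three classes.
cal_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.3, 0.3, 0.4],
    [0.5, 0.4, 0.1],
]
cal_labels = [0, 2, 2, 1]

alpha_hat = topk_empirical_risk(cal_probs, cal_labels, k=1)  # risk of top-1 sets
qhat = conformal_threshold(cal_probs, cal_labels, alpha_hat)  # calibrate at alpha = alpha_hat
print(alpha_hat, qhat, prediction_set([0.5, 0.4, 0.1], qhat))
```

Matching α to α̂ is what makes the comparison fair: the fixed-size sets and the conformal sets aim at the same coverage level, and differ only in how set size varies from one example to the next.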