Conformal Prediction Sets with Limited False Positives
Authors: Adam Fisch, Tal Schuster, Tommi Jaakkola, Regina Barzilay
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We demonstrate the effectiveness of this approach across a number of classification tasks in natural language processing, computer vision, and computational chemistry." and "We now present our empirical results." |
| Researcher Affiliation | Collaboration | "¹CSAIL, Massachusetts Institute of Technology. ²Google Research." |
| Pseudocode | Yes | "Algorithm 1: Pseudocode for conformal prediction with limited false positives (in-expectation case, see Eq. (1))." A hedged sketch of this calibration step appears after the table. |
| Open Source Code | Yes | "Our code is publicly available at https://github.com/ajfisch/conformal-fp." |
| Open Datasets | Yes | "All datasets and base models used in this paper are publicly available (see Section 5.1 and Appendix C for details)." and "We use the ChEMBL database (Mayr et al., 2018)... We use the MS-COCO dataset (Lin et al., 2014)... We use the CoNLL NER dataset (Tjong Kim Sang and De Meulder, 2003)..." |
| Dataset Splits | Yes | "For each task we learn all models on a training set, perform model selection on a validation set, and report final results as the average over 1000 random trials on a test set, where in each trial we partition the data into 80% calibration (x_{1:n}) and 20% prediction points (x_{n+1})." A sketch of this trial loop follows the table. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper refers to external models and repositories used (e.g., chemprop, EfficientDet, PURE, ALBERT-base) but does not list specific version numbers for its own software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | "We use the official code repository and the following parameters: 1e-5 learning rate, 5e-4 task learning rate, 32 train batch size, and 100 context window." |
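
As a companion to the pseudocode row above, here is a minimal sketch of the in-expectation idea behind Algorithm 1: pick the loosest score threshold whose average false-positive count on the calibration data stays within a budget k. This is a simplified illustration under assumed inputs, not the paper's exact procedure; `calibrate_fp_threshold`, `predict_set`, and the array layout are hypothetical.

```python
import numpy as np

def calibrate_fp_threshold(cal_scores, cal_labels, k):
    """Pick the loosest score threshold whose average number of false
    positives per calibration example is at most k.

    cal_scores: (n, num_labels) float array of per-label confidence scores.
    cal_labels: (n, num_labels) 0/1 array marking the true labels.
    k: tolerated expected number of false positives per prediction set.
    """
    # Every score in the calibration set is a candidate threshold,
    # scanned in ascending order (loosest first).
    for tau in np.unique(cal_scores):
        in_set = cal_scores >= tau                        # set membership
        fp_counts = (in_set & (cal_labels == 0)).sum(axis=1)
        # False-positive counts only shrink as tau grows, so the first
        # admissible tau is the loosest one.
        if fp_counts.mean() <= k:
            return tau
    return np.inf  # no threshold satisfies the budget: emit empty sets

def predict_set(scores, tau):
    """Indices of the labels admitted into the prediction set."""
    return np.flatnonzero(scores >= tau)
```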
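
The dataset-splits row reports averages over 1000 random 80/20 calibration/prediction splits. A sketch of that evaluation loop, reusing the hypothetical `calibrate_fp_threshold` above, could look like:

```python
import numpy as np

def run_trials(scores, labels, k, n_trials=1000, cal_frac=0.8, seed=0):
    """Average per-example false positives on the prediction split,
    over repeated random 80/20 calibration/prediction partitions."""
    rng = np.random.default_rng(seed)
    n = len(scores)
    n_cal = int(cal_frac * n)
    trial_fps = []
    for _ in range(n_trials):
        perm = rng.permutation(n)
        cal, test = perm[:n_cal], perm[n_cal:]
        # Calibrate on 80% of the data, evaluate on the held-out 20%.
        tau = calibrate_fp_threshold(scores[cal], labels[cal], k)
        in_set = scores[test] >= tau
        fp = (in_set & (labels[test] == 0)).sum(axis=1)
        trial_fps.append(fp.mean())
    return float(np.mean(trial_fps))
```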