Conformal Prediction via Regression-as-Classification
Authors: Etash Kumar Guha, Shlok Natarajan, Thomas Möllenhoff, Mohammad Emtiyaz Khan, Eugene Ndiaye
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on many benchmarks show that this simple approach gives surprisingly good results on many practical problems. We investigate the empirical behavior of our R2CCP (Regression-to-Classification Conformal Prediction) method, which we have explained in detail in Algorithm 1. We have three sets of experiments. |
| Researcher Affiliation | Collaboration | Etash Guha, RIKEN Center for AI Project and SambaNova Systems (etash.guha@sambanovasystems.com); Shlok Natarajan, Salesforce (shloknatarajan@salesforce.com); Thomas Möllenhoff, RIKEN Center for AI Project (thomas.moellenhoff@riken.jp); Mohammad Emtiyaz Khan, RIKEN Center for AI Project (emtiyaz.khan@riken.jp); Eugene Ndiaye, Apple (e_ndiaye@apple.com) |
| Pseudocode | Yes | Algorithm 1 Regression to Classification Conformal Prediction (R2CCP). |
| Open Source Code | Yes | All code to run our method can be installed via pip install r2ccp. |
| Open Datasets | Yes | Specifically, these are several datasets from the UCI Machine Learning repository (Bio, Blog, Concrete, Community, Energy, Forest, Stock, Cancer, Solar, Parkinsons, Pendulum) (Nottingham et al., 2023) and the Medical Expenditure Panel Survey numbers 19–21 (MEPS-19–21) (Cohen et al., 2009). |
| Dataset Splits | No | The paper's Algorithm 1 (step 4) states 'Randomly split the dataset Dn in training Dtr and calibration Dcal', but it does not provide specific percentages or absolute counts for these splits, nor does it refer to predefined standard splits with their ratios. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU/CPU models, memory, or cloud instance types. |
| Software Dependencies | No | The paper mentions 'pip install r2ccp' and 'AdamW as an optimizer' but does not specify version numbers for Python, deep learning frameworks (e.g., PyTorch, TensorFlow), or other key software dependencies. |
| Experiment Setup | Yes | We do not tune the hyperparameters and keep values of K = 50, p = 0.5, and τ = 0.2 constant across all experiments. For all experiments, we report length, meaning the length of all the sets predicted, and coverage, the percent of instances where the true label is contained in the predicted intervals. ... Specifically, we discretize the range space into K = 50 points, weight the entropy term by τ = 0.2, use a 1000-dimensional hidden layer, use 4 layers, use weight decay of 1e-4, use p = 0.5, and use AdamW as an optimizer. For most of the experiments, we use learning rate 1e-4 and batch size 32. However, for certain datasets, namely the MEPS datasets, we used a larger batch size of 256 to improve training time and used a smaller learning rate to prevent training divergence. |
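The pipeline quoted in the table (discretize the label range into K = 50 bins, randomly split the data into training and calibration sets, fit a classifier over the bins, then keep the bins whose probability clears a calibrated threshold) can be sketched in plain NumPy. This is a hedged illustration, not the authors' R2CCP implementation: the synthetic data, the polynomial-feature softmax classifier, the 50/50 split ratio, and the use of the raw true-bin probability as the conformity score are all simplifying assumptions made here; the paper's actual score, entropy-regularized loss, and network differ, and `pip install r2ccp` provides the real implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D regression data (illustrative only, not from the paper).
n = 2000
X = rng.uniform(-1, 1, size=(n, 1))
y = np.sin(4 * X[:, 0]) + 0.1 * rng.standard_normal(n)

# Step 1: discretize the label range into K bins (the paper uses K = 50).
K = 50
edges = np.linspace(y.min(), y.max(), K + 1)
centers = (edges[:-1] + edges[1:]) / 2
labels = np.clip(np.digitize(y, edges) - 1, 0, K - 1)

# Step 2: random train/calibration split (Algorithm 1, step 4; the paper
# does not fix a ratio, so 50/50 is an assumption).
idx = rng.permutation(n)
tr, cal = idx[: n // 2], idx[n // 2:]

# Step 3: fit a softmax classifier over the bins. A tiny polynomial-feature
# model stands in for the paper's network; conformal calibration below is
# valid regardless of how good this model is.
def features(x):
    return np.hstack([x ** d for d in range(6)])

Phi = features(X)
W = np.zeros((Phi.shape[1], K))
for _ in range(500):  # plain gradient descent on cross-entropy
    logits = Phi[tr] @ W
    logits -= logits.max(axis=1, keepdims=True)
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)
    G = P.copy()
    G[np.arange(len(tr)), labels[tr]] -= 1
    W -= 0.5 * Phi[tr].T @ G / len(tr)

def probs(x):
    logits = features(x) @ W
    logits -= logits.max(axis=1, keepdims=True)
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)

# Step 4: conformity score = probability assigned to the bin containing the
# true y (a simplification; the paper derives a smoother score from the
# discretized distribution). Threshold at the alpha-quantile of calibration
# scores for 1 - alpha = 90% target coverage.
alpha = 0.1
cal_scores = probs(X[cal])[np.arange(len(cal)), labels[cal]]
q = np.quantile(cal_scores, alpha * (1 + 1 / len(cal)), method="lower")

def predict_set(x):
    """Return the bin centers kept in the conformal prediction set for x."""
    return centers[probs(x)[0] >= q]
```

The key property of this split-conformal construction is that the coverage guarantee does not depend on the classifier's quality: a poorly fit model simply produces larger prediction sets, while the fraction of points whose true bin clears the threshold stays near 1 - alpha by construction of the quantile.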