Calibrated Learning to Defer with One-vs-All Classifiers
Authors: Rajeev Verma, Eric Nalisnick
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments verify that not only is our system calibrated, but this benefit comes at no cost to accuracy. Our model's accuracy is always comparable (and often superior) to Mozannar & Sontag's (2020) model's in tasks ranging from hate speech detection to galaxy classification to diagnosis of skin lesions. |
| Researcher Affiliation | Academia | Rajeev Verma, Eric Nalisnick; Informatics Institute, University of Amsterdam, Amsterdam, Netherlands. Correspondence to: Rajeev Verma <rajeev.ee15@gmail.com>, Eric Nalisnick <e.t.nalisnick@uva.nl>. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our software implementations are publicly available at https://github.com/rajevv/OvA-L2D |
| Open Datasets | Yes | We use the standard train-test splits of CIFAR-10 (Krizhevsky, 2009). We also use HAM10000 (Tschandl et al., 2018), Galaxy-Zoo (Bamford et al., 2009), and Hate Speech (Davidson et al., 2017) datasets. |
| Dataset Splits | Yes | We further partition the training split 90%/10% to form training and validation sets, respectively. We partition the data into 60% training, 20% validation, and 20% test splits. (A minimal split sketch is given after this table.) |
| Hardware Specification | No | The paper does not specify the hardware used for training (e.g., GPU model, CPU, or memory). |
| Software Dependencies | No | The paper mentions using SGD and Adam optimizers, Wide Residual Networks, MLP-Mixer, and ResNet-34 models, but does not specify software dependencies with version numbers (e.g., PyTorch 1.9, Python 3.8). |
| Experiment Setup | Yes | We use SGD with a momentum of 0.9, weight decay of 5e-4, and an initial learning rate of 0.1. We further use a cosine annealing learning rate schedule. We train this model with the Adam optimization algorithm with a learning rate of 0.001 and a weight decay of 5e-4. We further use a cosine annealing learning rate schedule with a warm-up period of 5 epochs. (A hedged optimizer sketch appears after this table.) |
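As referenced in the Dataset Splits row, the reported ratios can be expressed as a minimal sketch. The code below assumes PyTorch/torchvision (`torch.utils.data.random_split`, `datasets.CIFAR10`) and a generic placeholder dataset; it illustrates the stated 90%/10% and 60%/20%/20% partitions, not the authors' actual data pipeline.

```python
# Minimal sketch of the reported splits, assuming PyTorch / torchvision.
# Dataset objects and seeds here are placeholders, not the authors' pipeline.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# CIFAR-10: standard train split, further partitioned 90% / 10%
# into training and validation sets.
cifar_train = datasets.CIFAR10(root="./data", train=True, download=True,
                               transform=transforms.ToTensor())
n_train = int(0.9 * len(cifar_train))
train_set, val_set = random_split(
    cifar_train, [n_train, len(cifar_train) - n_train],
    generator=torch.Generator().manual_seed(0))

# Other datasets (e.g., HAM10000, Galaxy-Zoo, Hate Speech):
# 60% / 20% / 20% train / validation / test partition.
def split_60_20_20(dataset, seed=0):
    n = len(dataset)
    n_tr, n_val = int(0.6 * n), int(0.2 * n)
    return random_split(dataset, [n_tr, n_val, n - n_tr - n_val],
                        generator=torch.Generator().manual_seed(seed))
```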
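The Experiment Setup row maps onto standard PyTorch optimizers and schedulers; the sketch below is one plausible reading of those settings. The placeholder modules, the epoch count, and the use of `LinearLR` + `SequentialLR` for the 5-epoch warm-up are assumptions, not the authors' exact implementation.

```python
# Hedged sketch of the reported optimizer settings using standard PyTorch APIs.
# `base_model`, `ova_head`, and `num_epochs` are placeholders.
import torch
import torch.nn as nn

base_model = nn.Linear(32 * 32 * 3, 11)  # stand-in for the WideResNet classifier
ova_head = nn.Linear(512, 11)            # stand-in for the OvA surrogate head
num_epochs = 200                         # placeholder; epoch count not given here

# Classifier: SGD with momentum 0.9, weight decay 5e-4, initial LR 0.1,
# and a cosine annealing schedule.
sgd = torch.optim.SGD(base_model.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=5e-4)
sgd_sched = torch.optim.lr_scheduler.CosineAnnealingLR(sgd, T_max=num_epochs)

# Second model: Adam with LR 1e-3 and weight decay 5e-4, cosine annealing
# after a 5-epoch warm-up. LinearLR chained with CosineAnnealingLR via
# SequentialLR is one common way to express such a warm-up.
adam = torch.optim.Adam(ova_head.parameters(), lr=1e-3, weight_decay=5e-4)
warmup = torch.optim.lr_scheduler.LinearLR(adam, start_factor=0.1, total_iters=5)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(adam, T_max=num_epochs - 5)
adam_sched = torch.optim.lr_scheduler.SequentialLR(
    adam, schedulers=[warmup, cosine], milestones=[5])
```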