Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration

Authors: Shengjia Zhao, Michael Kim, Roshni Sahoo, Tengyu Ma, Stefano Ermon

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our recalibration algorithm empirically: compared to existing methods, decision calibration improves decision-making on skin lesion and ImageNet classification with modern neural network predictors.
Researcher Affiliation | Academia | Shengjia Zhao (Stanford University, sjzhao@stanford.edu); Michael P. Kim (UC Berkeley, mpkim@berkeley.edu); Roshni Sahoo (Stanford University, rsahoo@stanford.edu); Tengyu Ma (Stanford University, tengyuma@stanford.edu); Stefano Ermon (Stanford University, ermon@stanford.edu)
Pseudocode | Yes | Algorithm 1: Recalibration algorithm to achieve L^K decision calibration (a simplified sketch follows the table below).
Open Source Code | No | The paper does not provide any specific statements or links regarding the release of open-source code for the described methodology.
Open Datasets | Yes | We use the HAM10000 dataset (Tschandl et al., 2018).
Dataset Splits | Yes | We partition the dataset into train/validation/test sets, where approximately 15% of the data are used for validation and 10% for the test set (an illustrative split is sketched below).
Hardware Specification | No | The paper does not specify the hardware used for running experiments (e.g., GPU models, CPU types, or memory).
Software Dependencies | No | The paper mentions 'pytorch' but does not specify version numbers for any software dependencies.
Experiment Setup | Yes | For modeling we use the densenet-121 architecture (Huang et al., 2017), which achieves around 90% accuracy. ... For these experiments we set the number of actions K = 3. (A model-setup sketch follows below.)
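
The Pseudocode row above refers to Algorithm 1, the recalibration procedure the paper proposes for achieving L^K decision calibration. Since no reference implementation is linked, the snippet below is only a minimal NumPy sketch of the general idea: repeatedly search over K-action loss functions, partition validation predictions by the Bayes-optimal action each loss induces, and patch the predictions toward the empirical label frequencies within each partition cell. The function name decision_recalibrate, the random search over loss matrices, the patch-and-renormalize step, and the stopping rule are illustrative assumptions rather than the authors' exact Algorithm 1.

```python
import numpy as np

def decision_recalibrate(probs, labels, K=3, n_rounds=50, n_loss_samples=200, tol=1e-3, seed=0):
    """Simplified sketch of a decision-calibration style recalibration loop.

    probs  : (n, C) predicted class probabilities on a validation set
    labels : (n,)   integer class labels
    K      : number of actions allowed in the loss functions considered
    """
    rng = np.random.default_rng(seed)
    n, C = probs.shape
    onehot = np.eye(C)[labels]          # (n, C) empirical label indicators
    probs = probs.copy()

    for _ in range(n_rounds):
        worst_gap, worst_cells = 0.0, None
        # Search over randomly sampled K-action loss matrices for the partition
        # (by induced Bayes action) with the largest prediction/label mismatch.
        for _ in range(n_loss_samples):
            L = rng.normal(size=(K, C))               # loss of action a on class y
            actions = (probs @ L.T).argmin(axis=1)    # Bayes action under each prediction
            cells = [actions == a for a in range(K)]
            gap = sum(
                np.abs(probs[c].mean(axis=0) - onehot[c].mean(axis=0)).sum() * c.mean()
                for c in cells if c.any()
            )
            if gap > worst_gap:
                worst_gap, worst_cells = gap, cells
        if worst_cells is None or worst_gap < tol:
            break
        # Patch predictions within each action cell toward the empirical label
        # frequencies, then project back onto the probability simplex.
        for c in worst_cells:
            if c.any():
                probs[c] += onehot[c].mean(axis=0) - probs[c].mean(axis=0)
        probs = np.clip(probs, 1e-8, None)
        probs /= probs.sum(axis=1, keepdims=True)
    return probs
```

The patched probabilities can then replace the raw network outputs when downstream decision-makers pick loss-minimizing actions.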
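
The split reported in the Dataset Splits row (roughly 75% train, 15% validation, 10% test) could be reproduced with a simple shuffled index split such as the one below. The seed, the use of a uniform rather than stratified shuffle, and the helper name split_indices are assumptions, since the paper does not specify the splitting procedure.

```python
import numpy as np

def split_indices(n, val_frac=0.15, test_frac=0.10, seed=0):
    """Shuffle indices and carve off ~10% for test and ~15% for validation; the rest is train."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test, n_val = int(test_frac * n), int(val_frac * n)
    return idx[n_test + n_val:], idx[n_test:n_test + n_val], idx[:n_test]

# HAM10000 contains 10,015 dermatoscopic images.
train_idx, val_idx, test_idx = split_indices(10015)
```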
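
For the Experiment Setup row, a plausible PyTorch/torchvision configuration is sketched below: DenseNet-121 with its classifier head resized to the seven HAM10000 diagnostic classes, and K = 3 actions for decision calibration. Whether pretrained ImageNet weights were used, along with the optimizer and learning rate shown, are assumptions not stated in the excerpt.

```python
import torch
import torchvision

# DenseNet-121 backbone; replace the 1000-way ImageNet classifier with a
# 7-way head for the HAM10000 diagnostic categories.
model = torchvision.models.densenet121(weights="IMAGENET1K_V1")
model.classifier = torch.nn.Linear(model.classifier.in_features, 7)

# Placeholder training objects; the paper's exact hyperparameters are not given here.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

K = 3  # number of actions considered when enforcing L^K decision calibration
```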