Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration

Authors: Shengjia Zhao, Michael Kim, Roshni Sahoo, Tengyu Ma, Stefano Ermon

NeurIPS 2021 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We validate our recalibration algorithm empirically: compared to existing methods, decision calibration improves decision-making on skin lesion and ImageNet classification with modern neural network predictors." |
| Researcher Affiliation | Academia | Shengjia Zhao, Stanford University, EMAIL; Michael P. Kim, UC Berkeley, EMAIL; Roshni Sahoo, Stanford University, EMAIL; Tengyu Ma, Stanford University, EMAIL; Stefano Ermon, Stanford University, EMAIL |
| Pseudocode | Yes | "Algorithm 1: Recalibration algorithm to achieve LK decision calibration." |
| Open Source Code | No | The paper does not provide any specific statements or links regarding the release of open-source code for the described methodology. |
| Open Datasets | Yes | "We use the HAM10000 dataset (Tschandl et al., 2018)." |
| Dataset Splits | Yes | "We partition the dataset into train/validation/test sets, where approximately 15% of the data are used for validation, while 10% are used for the test set." |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | The paper mentions 'pytorch' but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | "For modeling we use the densenet-121 architecture (Huang et al., 2017), which achieves around 90% accuracy. ... For these experiments we set the number of actions K = 3." |
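The ~75/15/10 train/validation/test partition quoted above can be sketched in plain Python. The paper does not describe the exact splitting procedure or random seed, so the shuffling and seed below are illustrative assumptions:

```python
import random

def split_indices(n, val_frac=0.15, test_frac=0.10, seed=0):
    """Partition n dataset indices into train/validation/test sets.

    Fractions follow the paper's description (~15% validation, 10% test);
    the shuffle-based procedure and seed are illustrative assumptions.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_val = int(n * val_frac)
    n_test = int(n * test_frac)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, val, test

# Example with the HAM10000 dataset size (10,015 images)
train, val, test = split_indices(10015)
```

The three index lists are disjoint and together cover the whole dataset, so each image lands in exactly one split.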
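The experiment setup fixes the number of actions at K = 3. Decision calibration concerns the quality of loss-minimizing actions taken under a predicted class distribution; the sketch below shows that generic selection step. The loss matrix and action names are hypothetical, not taken from the paper:

```python
def bayes_action(probs, loss):
    """Return the action minimizing expected loss under predicted probs.

    probs: predicted class probabilities, length C.
    loss:  loss[a][y] = loss incurred by action a when the true class is y.
    The decision rule is standard; the paper's actual loss matrices are
    not reproduced here.
    """
    expected = [sum(p * l for p, l in zip(probs, row)) for row in loss]
    return min(range(len(loss)), key=lambda a: expected[a])

# K = 3 hypothetical actions over a binary (e.g. benign/malignant) prediction
loss = [
    [0.0, 10.0],  # action 0: treat as benign (costly if malignant)
    [5.0, 0.0],   # action 1: treat as malignant (costly if benign)
    [1.0, 1.0],   # action 2: refer to a specialist (small fixed cost)
]
```

A decision-calibrated predictor guarantees that the expected losses computed from `probs` match the losses actually realized by the induced actions, for every loss function with K actions.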