Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration
Authors: Shengjia Zhao, Michael Kim, Roshni Sahoo, Tengyu Ma, Stefano Ermon
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our recalibration algorithm empirically: compared to existing methods, decision calibration improves decision-making on skin lesion and ImageNet classification with modern neural network predictors. |
| Researcher Affiliation | Academia | Shengjia Zhao (Stanford University), Michael P. Kim (UC Berkeley), Roshni Sahoo (Stanford University), Tengyu Ma (Stanford University), Stefano Ermon (Stanford University) |
| Pseudocode | Yes | Algorithm 1: Recalibration algorithm to achieve $\mathcal{L}^K$ decision calibration. |
| Open Source Code | No | The paper does not provide any specific statements or links regarding the release of open-source code for the described methodology. |
| Open Datasets | Yes | We use the HAM10000 dataset (Tschandl et al., 2018). |
| Dataset Splits | Yes | We partition the dataset into train/validation/test sets, where approximately 15% of the data are used for validation, while 10% are used for the test set. |
| Hardware Specification | No | The paper does not specify the hardware used for running experiments (e.g., GPU models, CPU types, or memory). |
| Software Dependencies | No | The paper mentions 'pytorch' but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | For modeling we use the densenet-121 architecture (Huang et al., 2017), which achieves around 90% accuracy. ... For these experiments we set the number of actions K = 3. |
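
The Dataset Splits row reports roughly 15% of the data for validation and 10% for testing, which leaves about 75% for training. The paper does not release code, so the following is only a minimal sketch of how such a partition could be produced. The function name `split_indices`, the fixed seed, the rounding rule, and the sample count of 1000 are all illustrative assumptions, not details taken from the paper.

```python
import random


def split_indices(n_samples, val_frac=0.15, test_frac=0.10, seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test lists.

    The fractions mirror the approximate 15% / 10% figures quoted above;
    the seed and the rounding rule are assumptions made for this sketch.
    """
    if not 0 <= val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be in [0, 1)")

    indices = list(range(n_samples))
    # A dedicated Random instance keeps the split repeatable
    # without touching the global random state.
    random.Random(seed).shuffle(indices)

    n_val = round(n_samples * val_frac)
    n_test = round(n_samples * test_frac)

    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]  # everything left over is training data
    return train, val, test


# Hypothetical dataset size, chosen only to make the proportions easy to read.
train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # prints: 750 150 100
```

The three index lists are disjoint and together cover every sample, so they can be used to subset any indexable dataset. For a class-imbalanced dataset, a stratified split may be preferable; the table does not say which approach the authors used.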