When Does Optimizing a Proper Loss Yield Calibration?
Authors: Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The goal of our work is to analyze this phenomenon from a theoretical perspective: what minimal conditions on model family and training procedure guarantee that optimizing for a proper loss provably yields small calibration error? In this work, we provide a rigorous answer to these questions. We show that any predictor with this local optimality satisfies smooth calibration as defined in Kakade and Foster [2008] and Błasiok et al. [2023b]. Local optimality is plausibly satisfied by well-trained DNNs, which suggests an explanation for why they are calibrated from proper loss minimization alone. Finally, we show that the connection between local optimality and calibration error goes both ways: nearly calibrated predictors are also nearly locally optimal. (An illustrative estimator for smooth calibration error appears below the table.) |
| Researcher Affiliation | Collaboration | Jarosław Błasiok (Columbia University, jb4451@columbia.edu); Parikshit Gopalan (Apple, parik.g@gmail.com); Lunjia Hu (Stanford University, lunjia@stanford.edu); Preetum Nakkiran (Apple, preetum.nakkiran@gmail.com) |
| Pseudocode | Yes | Algorithm 1: Local search for small pGap. Algorithm 2: Regularized loss minimization for small pGap. (A local-search sketch appears below the table.) |
| Open Source Code | No | The paper does not provide any statements about releasing source code for the methodology described, nor does it include links to a code repository. |
| Open Datasets | No | This paper is theoretical and does not conduct empirical experiments on specific datasets. It discusses distributions and samples from a theoretical perspective rather than using concrete, publicly available datasets for training or evaluation. |
| Dataset Splits | No | This paper is theoretical and does not involve empirical model training or validation, hence it does not provide dataset split information. |
| Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not describe any software implementations or dependencies with specific version numbers. |
| Experiment Setup | No | The paper is theoretical and does not conduct experiments; it therefore provides no experimental setup details such as hyperparameters or training settings. |
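
The paper's central calibration notion is smooth calibration, smCE(f) = sup over 1-Lipschitz witnesses w: [0,1] → [-1,1] of E[w(f(x))(y − f(x))]. Below is a minimal sketch, not from the paper, of how one might estimate this quantity on a finite sample: it restricts the witness to piecewise-linear functions on a grid (snapping predictions to the nearest grid point, which adds O(1/grid_size) error) and solves the resulting linear program. The function name and grid parameterization are our own choices for illustration.

```python
# Sketch: empirical smooth calibration error via a linear program.
# smCE(f) = sup_{w 1-Lipschitz, |w| <= 1} E[w(f(x)) * (y - f(x))],
# with w restricted to piecewise-linear functions on a uniform grid.
import numpy as np
from scipy.optimize import linprog

def smooth_calibration_error(preds, labels, grid_size=101):
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n = len(preds)
    h = 1.0 / (grid_size - 1)  # grid spacing in [0, 1]

    # Objective coefficient c[j] = (1/n) * sum of (y_i - f_i) over
    # samples whose prediction snaps to grid point j.
    idx = np.rint(preds / h).astype(int)
    c = np.zeros(grid_size)
    np.add.at(c, idx, (labels - preds) / n)

    # 1-Lipschitz constraints on the grid: |w[j+1] - w[j]| <= h,
    # encoded as two inequality rows per adjacent pair of points.
    D = np.zeros((grid_size - 1, grid_size))
    D[np.arange(grid_size - 1), np.arange(grid_size - 1)] = -1.0
    D[np.arange(grid_size - 1), np.arange(1, grid_size)] = 1.0
    A_ub = np.vstack([D, -D])
    b_ub = np.full(2 * (grid_size - 1), h)

    # linprog minimizes, so maximize c.w by minimizing -c.w; |w| <= 1.
    res = linprog(-c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(-1.0, 1.0)] * grid_size, method="highs")
    assert res.success
    return -res.fun
```

Since w = 0 is always feasible, the estimate is nonnegative, matching the fact that smCE is a nonnegative calibration measure.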
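
The paper's Algorithm 1 performs local search for a small post-processing gap (pGap): a predictor is locally optimal when no Lipschitz post-processing of its outputs reduces the proper loss. The following is a minimal sketch of that idea under assumptions of our own: squared loss, updates of the form f ← clip(f + η·w(f)), and a small hand-picked family of 1-Lipschitz "hat" witnesses standing in for the richer family analyzed in the paper.

```python
# Sketch: local search toward small pGap under squared loss.
# Repeatedly apply a Lipschitz post-processing of the predictions
# whenever it reduces the loss; stop when no candidate update helps,
# i.e. when the predictor is (approximately) locally optimal.
import numpy as np

def local_search_postprocess(preds, labels, step=0.05, tol=1e-4,
                             max_iters=200):
    f = np.asarray(preds, dtype=float).copy()
    y = np.asarray(labels, dtype=float)

    def sq_loss(p):
        return np.mean((y - p) ** 2)

    # Candidate 1-Lipschitz witnesses w: [0,1] -> [-1,1]; shifted "hat"
    # functions plus constant shifts (an illustrative, not exhaustive,
    # family).
    def hat(center):
        return lambda t: np.clip(1.0 - np.abs(t - center), -1.0, 1.0)

    candidates = [hat(c) for c in np.linspace(0.0, 1.0, 21)]
    candidates += [lambda t: np.ones_like(t), lambda t: -np.ones_like(t)]

    for _ in range(max_iters):
        base = sq_loss(f)
        best_update, best_loss = None, base
        for w in candidates:
            g = np.clip(f + step * w(f), 0.0, 1.0)  # Lipschitz update
            loss_g = sq_loss(g)
            if loss_g < best_loss - tol:
                best_update, best_loss = g, loss_g
        if best_update is None:  # no update improves loss: small pGap
            break
        f = best_update
    return f
```

The stopping condition mirrors the paper's two-way connection: when no Lipschitz update in the family lowers the loss by more than tol, the (empirical) post-processing gap over that family is small, which by the paper's results corresponds to small smooth calibration error.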