When Does Optimizing a Proper Loss Yield Calibration?

Authors: Jaroslaw Blasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | The goal of our work is to analyze this phenomenon from a theoretical perspective. What minimal conditions on model family and training procedure guarantee that optimizing for a proper loss provably yields small calibration error? In this work, we provide a rigorous answer to these questions. We show that any predictor with this local optimality (i.e., whose proper loss cannot be reduced much by post-processing its predictions with a Lipschitz function) satisfies smooth calibration as defined in Kakade and Foster [2008], Błasiok et al. [2023b]. Local optimality is plausibly satisfied by well-trained DNNs, which suggests an explanation for why they are calibrated from proper loss minimization alone. Finally, we show that the connection between local optimality and calibration error goes both ways: nearly calibrated predictors are also nearly locally optimal. (A sketch illustrating smooth calibration error appears after this table.)
Researcher Affiliation | Collaboration | Jarosław Błasiok (Columbia University, jb4451@columbia.edu); Parikshit Gopalan (Apple, parik.g@gmail.com); Lunjia Hu (Stanford University, lunjia@stanford.edu); Preetum Nakkiran (Apple, preetum.nakkiran@gmail.com)
Pseudocode | Yes | Algorithm 1: Local search for small pGap. Algorithm 2: Regularized loss minimization for small pGap. (A toy local-search sketch illustrating pGap appears after this table.)
Open Source Code | No | The paper does not provide any statements about releasing source code for the methodology described, nor does it include links to a code repository.
Open Datasets | No | This paper is theoretical and does not conduct empirical experiments on specific datasets. It discusses distributions and samples from a theoretical perspective rather than using concrete, publicly available datasets for training or evaluation.
Dataset Splits | No | This paper is theoretical and does not involve empirical model training or validation, hence it does not provide dataset split information.
Hardware Specification | No | The paper is theoretical and does not describe any experiments that would require specific hardware specifications.
Software Dependencies | No | The paper is theoretical and does not describe any software implementations or dependencies with specific version numbers.
Experiment Setup | No | The paper is theoretical and does not conduct experiments; therefore, no experimental setup details such as hyperparameters or training settings are provided.
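The smooth calibration error referenced in the Research Type row has a simple plug-in estimator. The sketch below is not from the paper; the function name, the discretization over distinct prediction values, and the linear-program formulation are our own illustrative choices. It estimates smCE(f) = sup_w E[(y - f(x)) * w(f(x))], with the supremum over 1-Lipschitz witnesses w: [0,1] -> [-1,1], by solving a small LP over the witness values at the sampled predictions (assuming numpy and scipy are available).

```python
# A minimal sketch (not the authors' code): plug-in estimate of the smooth
# calibration error smCE(f) = sup_w E[(y - f(x)) * w(f(x))], where w ranges
# over 1-Lipschitz witnesses w: [0,1] -> [-1,1]. The discretization and the
# LP formulation below are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

def smooth_calibration_error(preds, labels):
    """Estimate smCE from predictions in [0,1] and binary labels."""
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    n = len(preds)

    # Group residuals (y - f) by distinct prediction values v_1 < ... < v_m.
    values, inverse = np.unique(preds, return_inverse=True)
    m = len(values)
    c = np.zeros(m)
    np.add.at(c, inverse, (labels - preds) / n)  # c_j = (1/n) * sum of residuals at v_j

    # Maximize sum_j c_j * w_j subject to |w_j| <= 1 (bounded witness) and
    # |w_{j+1} - w_j| <= v_{j+1} - v_j (1-Lipschitz in the prediction value).
    A_ub, b_ub = [], []
    for j in range(m - 1):
        row = np.zeros(m)
        row[j], row[j + 1] = -1.0, 1.0
        gap = values[j + 1] - values[j]
        A_ub.append(row)
        b_ub.append(gap)        #  w_{j+1} - w_j <= gap
        A_ub.append(-row)
        b_ub.append(gap)        # -(w_{j+1} - w_j) <= gap
    res = linprog(-c,
                  A_ub=np.array(A_ub) if A_ub else None,
                  b_ub=np.array(b_ub) if b_ub else None,
                  bounds=[(-1.0, 1.0)] * m,
                  method="highs")
    return -res.fun  # value achieved by the maximizing witness

# Example: a calibrated predictor has near-zero smCE, a biased one does not.
rng = np.random.default_rng(0)
f = rng.uniform(0.2, 0.8, size=5000).round(2)        # predictions on a coarse grid
y_calibrated = rng.binomial(1, f)                     # labels drawn with P[y=1 | f] = f
y_shifted = rng.binomial(1, np.clip(f + 0.1, 0, 1))   # true probability exceeds f by 0.1
print(smooth_calibration_error(f, y_calibrated))      # small (sampling noise)
print(smooth_calibration_error(f, y_shifted))         # roughly 0.1
```

On the calibrated sample the estimate is dominated by sampling noise, while the systematically biased sample yields a value near the 0.1 bias, reflecting that smooth calibration error measures the correlation between residuals and any bounded, slowly varying function of the prediction.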
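Similarly, the post-processing gap (pGap) targeted by Algorithms 1 and 2 can be illustrated with a toy local search. The pGap compares the proper loss of a predictor f with the best loss achievable after applying a Lipschitz update to its predictions, and the paper shows it bounds, and is bounded by, the smooth calibration error up to polynomial factors. The sketch below is not the paper's Algorithm 1: purely for illustration, it restricts the update family to affine updates eta(v) = a + b*(v - 0.5) with |b| <= 1 and grid-searches b with the squared-loss-optimal intercept in closed form, which yields a lower bound on the true gap.

```python
# A toy illustration (not the paper's Algorithm 1): lower-bounding the
# post-processing gap, i.e. how much the squared loss of f can be reduced by a
# Lipschitz update of its predictions (definition paraphrased), by searching
# only the affine family eta(v) = a + b * (v - 0.5) with |b| <= 1. The names
# and the restricted family are illustrative assumptions.
import numpy as np

def squared_loss(preds, labels):
    return float(np.mean((labels - preds) ** 2))

def pgap_lower_bound(preds, labels, grid_size=101):
    """Best squared-loss improvement over the restricted update family;
    a lower bound on pGap, which optimizes over all 1-Lipschitz updates."""
    preds = np.asarray(preds, dtype=float)
    labels = np.asarray(labels, dtype=float)
    base = squared_loss(preds, labels)
    best = base
    centered = preds - 0.5
    for b in np.linspace(-1.0, 1.0, grid_size):
        a = float(np.mean(labels - preds - b * centered))       # optimal intercept for squared loss
        updated = np.clip(preds + a + b * centered, 0.0, 1.0)   # clipping cannot increase squared loss
        best = min(best, squared_loss(updated, labels))
    return base - best

# Example: a calibrated predictor is (nearly) locally optimal, so no update in the
# family helps; a miscalibrated one can be post-processed to a smaller loss.
rng = np.random.default_rng(1)
f = rng.uniform(0.2, 0.8, size=20000)
y_calibrated = rng.binomial(1, f)                           # P[y=1 | f] = f
y_miscal = rng.binomial(1, np.clip(1.2 * f - 0.05, 0, 1))   # systematically miscalibrated
print(pgap_lower_bound(f, y_calibrated))   # tiny (fitting two parameters to noise)
print(pgap_lower_bound(f, y_miscal))       # clearly positive
```

A locally optimal (well-trained) predictor admits no improving Lipschitz update, so the estimated gap is essentially zero; the miscalibrated predictor does admit one, and the size of the improvement certifies its miscalibration.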