Beyond calibration: estimating the grouping loss of modern neural networks
Authors: Alexandre Perez-Lebel, Marine Le Morvan, Gaël Varoquaux
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate on simulations that the proposed estimator can provide tight lower-bounds on the grouping loss (Section 5.1). We evidence for the first time the presence of grouping loss on pre-trained vision and language architectures, notably in distribution-shift settings (Section 5.2). |
| Researcher Affiliation | Academia | Alexandre Perez-Lebel, Marine Le Morvan, Gaël Varoquaux, Soda project team, Inria Saclay, Palaiseau, France |
| Pseudocode | No | The paper describes methods in text and mathematical formulations but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code for the implementation of the algorithm, experiments, simulations and figures is available on GitHub: https://github.com/aperezlebel/beyond_calibration. |
| Open Datasets | Yes | All datasets are publicly available (ImageNet-R, ImageNet-C, ImageNet-1K, Yahoo Answers Topics) |
| Dataset Splits | Yes | We divide the samples of the evaluation set in half making sure that the confidence score distribution is the same in both resulting subsets. On one set, we train the isotonic regression for calibration and calibrate the confidence scores of both sets. [...] with a 50-50 train-test split strategy. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running the experiments. |
| Software Dependencies | Yes | Model architectures and weights are available in PyTorch v0.12 (Paszke et al., 2019). |
| Experiment Setup | Yes | We build confidence scores by applying a softmax to the output logits. We extract a representation of the input images in the high-level feature space of the network... We divide the samples... in half... we train the isotonic regression... Then, we create groups of same-level confidences by binning the confidence scores with 15 equal-width bins in [0, 1]... constrained to one balanced split, with a 50-50 train-test split strategy... typically targeting a region ratio of a dozen, to obtain the best possible lower bound GL_LB. |
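
The evaluation protocol quoted above (a 50-50 split that preserves the confidence-score distribution, isotonic-regression calibration fit on one half, and grouping into 15 equal-width confidence bins) can be sketched as follows. This is a minimal illustration on synthetic scores, not the authors' implementation; the stratified split via binned scores is an assumed way to keep the confidence distributions matched across the two halves.

```python
# Sketch of the paper's evaluation protocol on synthetic data:
# 50-50 split with matched confidence distributions, isotonic
# calibration fit on one half, and 15 equal-width confidence bins.
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
scores = rng.uniform(0, 1, size=1000)                    # stand-in softmax confidences
labels = (rng.uniform(size=1000) < scores).astype(int)   # synthetic correctness labels

# Stratify on binned scores so both halves share the same
# confidence-score distribution (an assumed implementation detail).
strata = np.digitize(scores, np.linspace(0, 1, 16))
s_cal, s_eval, y_cal, y_eval = train_test_split(
    scores, labels, test_size=0.5, stratify=strata, random_state=0)

# Train isotonic regression on one half, then calibrate the other.
iso = IsotonicRegression(out_of_bounds="clip")
iso.fit(s_cal, y_cal)
calibrated = iso.predict(s_eval)

# Group same-level confidences: 15 equal-width bins in [0, 1].
edges = np.linspace(0, 1, 16)
bin_ids = np.clip(np.digitize(calibrated, edges) - 1, 0, 14)
```

Within each of the 15 bins, samples share (approximately) the same calibrated confidence, which is the grouping the paper's lower-bound estimator operates on.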