Dual Focal Loss for Calibration

Authors: Linwei Tao, Minjing Dong, Chang Xu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical evidence to support our approach and demonstrate its effectiveness through evaluations on multiple models and datasets, where it achieves state-of-the-art performance. Code is available at https://github.com/Linwei94/DualFocalLoss
Researcher Affiliation | Academia | School of Computer Science, University of Sydney. Correspondence to: Linwei Tao <linwei.tao@sydney.edu.au>, Chang Xu <c.xu@sydney.edu.au>.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/Linwei94/DualFocalLoss
Open Datasets | Yes | Our experiments are conducted on CIFAR-10/100 (Krizhevsky et al., 2009) and Tiny-ImageNet (Deng et al., 2009) for calibration performance. The SVHN dataset, a dataset of street-view house numbers, and the CIFAR-10-C dataset, a corrupted version of CIFAR-10, are used as out-of-distribution (OoD) datasets for evaluating the robustness of models.
Dataset Splits | Yes | We train CIFAR-10/100 for 350 epochs, using 5000 images from the training set for validation.
Hardware Specification | Yes | We conduct all experiments on a single Tesla V-100 GPU with all random seeds set to 1.
Software Dependencies | No | The paper mentions implementing the algorithm and using SGD, but it does not specify version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | We train CIFAR-10/100 for 350 epochs, using 5000 images from the training set for validation. The learning rate is set to 0.1 for the first 150 epochs, 0.01 for the following 100 epochs, and 0.001 for the remaining epochs. For Tiny-ImageNet, we train for 100 epochs, with the learning rate set to 0.1 for the first 40 epochs, 0.01 for the following 20 epochs, and 0.001 for the remaining epochs. We use SGD with a weight decay of 5 × 10⁻⁴ and a momentum of 0.9 for all experiments. The training and testing batch sizes for all datasets are set to 128.
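The Dataset Splits and Hardware Specification rows fix two details that matter for reproduction: a 5,000-image validation split carved out of the CIFAR training set and a global random seed of 1. The sketch below shows one way those details could be realized; PyTorch/torchvision and the use of random_split are assumptions, since the paper does not name its framework or its exact splitting routine.

    import torch
    from torch.utils.data import DataLoader, random_split
    from torchvision import datasets, transforms

    # Assumption: the framework is not stated in the paper; PyTorch/torchvision
    # are used here purely to illustrate the reported split and seed.
    torch.manual_seed(1)  # "all random seeds set to 1"

    full_train = datasets.CIFAR10(
        root="./data", train=True, download=True, transform=transforms.ToTensor())

    # Hold out 5,000 of the 50,000 CIFAR training images for validation.
    train_set, val_set = random_split(
        full_train, [45_000, 5_000],
        generator=torch.Generator().manual_seed(1))

    train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=128, shuffle=False)

Whether the authors split randomly or hold out a fixed slice of the training set is not specified, so the seeded generator above is only one defensible reading of the quoted setup.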
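The Experiment Setup row maps onto a standard SGD recipe with a step-wise learning-rate schedule. A minimal sketch of that configuration, again assuming PyTorch, is given below; the model is a hypothetical placeholder, and MultiStepLR with milestones at epochs 150 and 250 is one way to reproduce the reported 0.1/0.01/0.001 CIFAR schedule.

    from torch import nn, optim

    # Placeholder network; the paper evaluates standard architectures (see the repository).
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100))

    # SGD with momentum 0.9 and weight decay 5e-4, as reported for all experiments.
    optimizer = optim.SGD(model.parameters(), lr=0.1,
                          momentum=0.9, weight_decay=5e-4)

    # CIFAR-10/100: lr 0.1 for epochs 0-149, 0.01 for 150-249, 0.001 for 250-349.
    # The reported Tiny-ImageNet analogue is 100 epochs with milestones [40, 60].
    scheduler = optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[150, 250], gamma=0.1)

    for epoch in range(350):
        # ... one pass over train_loader (batch size 128) with the paper's loss ...
        scheduler.step()

Stepping the scheduler once per epoch matches the quoted breakpoints; the dual focal loss itself and the exact architectures are defined in the authors' code at https://github.com/Linwei94/DualFocalLoss rather than in this row.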