Dual Focal Loss for Calibration

Authors: Linwei Tao, Minjing Dong, Chang Xu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide theoretical evidence to support our approach and demonstrate its effectiveness through evaluations on multiple models and datasets, where it achieves state-of-the-art performance. Code is available at https://github.com/Linwei94/DualFocalLoss
Researcher Affiliation | Academia | School of Computer Science, University of Sydney. Correspondence to: Linwei Tao <linwei.tao@sydney.edu.au>, Chang Xu <c.xu@sydney.edu.au>.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/Linwei94/DualFocalLoss
Open Datasets | Yes | Our experiments are conducted on CIFAR-10/100 (Krizhevsky et al., 2009) and Tiny-ImageNet (Deng et al., 2009) for calibration performance. The SVHN dataset, a dataset of street-view house numbers, and the CIFAR-10-C dataset, a corrupted version of CIFAR-10, are used as out-of-distribution (OoD) datasets for evaluating the robustness of models.
Dataset Splits | Yes | We train CIFAR-10/100 for 350 epochs, using 5000 images from the training set for validation.
Hardware Specification | Yes | We conduct all experiments on a single Tesla V-100 GPU with all random seeds set to 1.
Software Dependencies | No | The paper mentions implementing the algorithm and using SGD, but it does not specify version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | We train CIFAR-10/100 for 350 epochs, using 5000 images from the training set for validation. The learning rate is set to 0.1 for the first 150 epochs, 0.01 for the following 100 epochs, and 0.001 for the remaining epochs. For Tiny-ImageNet, we train for 100 epochs, with the learning rate set to 0.1 for the first 40 epochs, 0.01 for the following 20 epochs, and 0.001 for the remaining epochs. We use SGD with a weight decay of 5 × 10⁻⁴ and a momentum of 0.9 for all experiments. The training and testing batch sizes for all datasets are set to 128.
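The Dataset Splits and Hardware Specification rows fix two details that matter for reproduction: a 5,000-image validation split carved out of the CIFAR training set and a global random seed of 1. The sketch below shows one way those details could be realized; PyTorch/torchvision and the use of random_split are assumptions, since the paper does not name its framework or its exact splitting routine.

    import torch
    from torch.utils.data import DataLoader, random_split
    from torchvision import datasets, transforms

    # Assumption: the framework is not stated in the paper; PyTorch/torchvision
    # are used here purely to illustrate the reported split and seed.
    torch.manual_seed(1)  # "all random seeds set to 1"

    full_train = datasets.CIFAR10(
        root="./data", train=True, download=True, transform=transforms.ToTensor())

    # Hold out 5,000 of the 50,000 CIFAR training images for validation.
    train_set, val_set = random_split(
        full_train, [45_000, 5_000],
        generator=torch.Generator().manual_seed(1))

    train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=128, shuffle=False)

Whether the authors split randomly or hold out a fixed slice of the training set is not specified, so the seeded generator above is only one defensible reading of the quoted setup.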
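The Experiment Setup row maps onto a standard SGD recipe with a step-wise learning-rate schedule. A minimal sketch of that configuration, again assuming PyTorch, is given below; the model is a hypothetical placeholder, and MultiStepLR with milestones at epochs 150 and 250 is one way to reproduce the reported 0.1/0.01/0.001 CIFAR schedule.

    from torch import nn, optim

    # Placeholder network; the paper evaluates standard architectures (see the repository).
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 100))

    # SGD with momentum 0.9 and weight decay 5e-4, as reported for all experiments.
    optimizer = optim.SGD(model.parameters(), lr=0.1,
                          momentum=0.9, weight_decay=5e-4)

    # CIFAR-10/100: lr 0.1 for epochs 0-149, 0.01 for 150-249, 0.001 for 250-349.
    # The reported Tiny-ImageNet analogue is 100 epochs with milestones [40, 60].
    scheduler = optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[150, 250], gamma=0.1)

    for epoch in range(350):
        # ... one pass over train_loader (batch size 128) with the paper's loss ...
        scheduler.step()

Stepping the scheduler once per epoch matches the quoted breakpoints; the dual focal loss itself and the exact architectures are defined in the authors' code at https://github.com/Linwei94/DualFocalLoss rather than in this row.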