Calibrating Deep Neural Networks using Focal Loss

Authors: Jishnu Mukhoti, Viveka Kulharia, Amartya Sanyal, Stuart Golodetz, Philip Torr, Puneet Dokania

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We perform extensive experiments on a variety of computer vision and NLP datasets, and with a wide variety of network architectures, and show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases."
Researcher Affiliation | Collaboration | Jishnu Mukhoti (University of Oxford; Five AI Ltd.), Viveka Kulharia (University of Oxford), Amartya Sanyal (University of Oxford; The Alan Turing Institute), Stuart Golodetz (Five AI Ltd.), Philip H. S. Torr (University of Oxford; Five AI Ltd.), Puneet K. Dokania (University of Oxford; Five AI Ltd.)
Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | "Code is available at https://github.com/torrvision/focal_calibration."
Open Datasets | Yes | "We conduct image and document classification experiments to test the performance of focal loss. For the former, we use CIFAR-10/100 [13] and Tiny-ImageNet [6], and train ResNet-50, ResNet-110 [8], Wide-ResNet-26-10 [42] and DenseNet-121 [10] models, and for the latter, we use 20 Newsgroups [17] and Stanford Sentiment Treebank (SST) [32] datasets and train Global Pooling CNN [18] and Tree-LSTM [33] models."
Dataset Splits | Yes | "Temperature Scaling: In order to compute the optimal temperature, we use two different methods: (a) learning the temperature by minimising val set NLL, and (b) performing grid search over temperatures between 0 and 10, with a step of 0.1, and finding the one that minimises val set ECE. ... For fair comparison, we chose 3 intermediate models for each loss function with the best val set ECE, NLL and accuracy..."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments.
Software Dependencies | No | The paper does not pin specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'); it mentions PyTorch only in a reference, without stating the versions used for this work.
Experiment Setup | Yes | "For the analysis, we train a ResNet-50 network on CIFAR-10 with state-of-the-art performance settings [31]. We use Stochastic Gradient Descent (SGD) with a mini-batch of size 128, momentum of 0.9, and learning rate schedule of {0.1, 0.01, 0.001} for the first 150, next 100, and last 100 epochs, respectively."
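For reference, the focal loss the paper evaluates, FL(p_t) = -(1 - p_t)^γ log(p_t), can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the authors' released code; the function name and array layout are assumptions.

```python
import numpy as np

def focal_loss(probs, labels, gamma=2.0):
    """Mean multi-class focal loss FL(p_t) = -(1 - p_t)^gamma * log(p_t).

    probs:  (N, C) array of predicted class probabilities.
    labels: (N,) array of integer ground-truth classes.
    gamma = 0 recovers the standard cross-entropy loss.
    """
    p_t = probs[np.arange(len(labels)), labels]  # probability of the true class
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))
```

With gamma > 0, the (1 - p_t)^gamma factor down-weights examples the model already classifies confidently, which is the mechanism the paper connects to improved calibration.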
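The {0.1, 0.01, 0.001} step schedule quoted in the Experiment Setup row can be sketched as a small helper; the function name is an illustrative assumption, not from the paper.

```python
def learning_rate(epoch):
    """Step schedule from the quoted setup: 0.1 for epochs 0-149,
    0.01 for epochs 150-249, and 0.001 for epochs 250-349."""
    if epoch < 150:
        return 0.1
    if epoch < 250:
        return 0.01
    return 0.001
```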
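The grid-search variant of temperature scaling described in the Dataset Splits row can be sketched as follows. This is a minimal NumPy illustration rather than the authors' implementation: the grid starts at 0.1 rather than 0 (T = 0 is undefined), and the 15-bin ECE is an assumed default.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ece(probs, labels, n_bins=15):
    """Expected Calibration Error: confidence-vs-accuracy gap, weighted by bin size."""
    conf = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (conf > lo) & (conf <= hi)
        if in_bin.any():
            total += in_bin.mean() * abs(correct[in_bin].mean() - conf[in_bin].mean())
    return total

def grid_search_temperature(val_logits, val_labels):
    """Method (b): pick T in (0, 10] with step 0.1 that minimises val-set ECE."""
    grid = np.arange(0.1, 10.01, 0.1)
    scores = [ece(softmax(val_logits / t), val_labels) for t in grid]
    return float(grid[int(np.argmin(scores))])
```

The chosen T then rescales test-time logits as softmax(logits / T), which changes the confidences but leaves the predicted classes unchanged.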