Calibrating Deep Neural Networks using Focal Loss
Authors: Jishnu Mukhoti, Viveka Kulharia, Amartya Sanyal, Stuart Golodetz, Philip Torr, Puneet Dokania
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments on a variety of computer vision and NLP datasets, and with a wide variety of network architectures, and show that our approach achieves state-of-the-art calibration without compromising on accuracy in almost all cases. |
| Researcher Affiliation | Collaboration | Jishnu Mukhoti (University of Oxford; Five AI Ltd.), Viveka Kulharia (University of Oxford), Amartya Sanyal (University of Oxford; The Alan Turing Institute), Stuart Golodetz (Five AI Ltd.), Philip H. S. Torr (University of Oxford; Five AI Ltd.), Puneet K. Dokania (University of Oxford; Five AI Ltd.) |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/torrvision/focal_calibration. |
| Open Datasets | Yes | We conduct image and document classification experiments to test the performance of focal loss. For the former, we use CIFAR-10/100 [13] and Tiny-ImageNet [6], and train ResNet-50, ResNet-110 [8], Wide-ResNet-26-10 [42] and DenseNet-121 [10] models, and for the latter, we use 20 Newsgroups [17] and Stanford Sentiment Treebank (SST) [32] datasets and train Global Pooling CNN [18] and Tree-LSTM [33] models. |
| Dataset Splits | Yes | Temperature Scaling: In order to compute the optimal temperature, we use two different methods: (a) learning the temperature by minimising val set NLL, and (b) performing grid search over temperatures between 0 and 10, with a step of 0.1, and finding the one that minimises val set ECE. ... For fair comparison, we chose 3 intermediate models for each loss function with the best val set ECE, NLL and accuracy... (see the grid-search sketch after the table) |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9, and CUDA 11.1'). It mentions 'PyTorch' only in a reference, without specifying the version used for the work's own experiments. |
| Experiment Setup | Yes | For the analysis, we train a ResNet-50 network on CIFAR-10 with state-of-the-art performance settings [31]. We use Stochastic Gradient Descent (SGD) with a mini-batch of size 128, momentum of 0.9, and learning rate schedule of {0.1, 0.01, 0.001} for the first 150, next 100, and last 100 epochs, respectively. (see the training-loop sketch after the table) |
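
For reference, the loss named in the paper's title is the focal loss of Lin et al. (2017), FL(p_t) = -(1 - p_t)^γ log(p_t), which down-weights the contribution of already well-classified examples relative to cross-entropy (γ = 0 recovers standard cross-entropy). Below is a minimal PyTorch sketch; the function name and the γ value shown are illustrative assumptions, not code from the authors' repository.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=3.0):
    """Multi-class focal loss on raw logits, averaged over the batch.

    FL(p_t) = -(1 - p_t)^gamma * log(p_t), where p_t is the softmax
    probability assigned to the true class. gamma=3.0 is an illustrative
    focusing-parameter choice, not necessarily the paper's setting.
    """
    log_pt = F.log_softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-(1 - pt) ** gamma * log_pt).mean()
```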
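
The grid-search variant of temperature scaling quoted in the Dataset Splits row can be sketched as follows: sweep the temperature T over (0, 10] in steps of 0.1 on held-out validation logits and keep the T that minimises ECE (T = 0 is excluded since it would divide the logits by zero). The function names and the 15-bin equal-width ECE below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def expected_calibration_error(probs, labels, n_bins=15):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    confidences, predictions = probs.max(dim=1)
    ece = torch.zeros(1)
    bin_edges = torch.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            acc = (predictions[in_bin] == labels[in_bin]).float().mean()
            conf = confidences[in_bin].mean()
            ece += in_bin.float().mean() * (acc - conf).abs()
    return ece.item()

def grid_search_temperature(val_logits, val_labels, step=0.1, max_t=10.0):
    """Return the temperature in (0, max_t] minimising validation-set ECE."""
    best_t, best_ece = 1.0, float("inf")
    for t in torch.arange(step, max_t + step, step):
        probs = F.softmax(val_logits / t, dim=1)
        ece = expected_calibration_error(probs, val_labels)
        if ece < best_ece:
            best_t, best_ece = t.item(), ece
    return best_t, best_ece
```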
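
The Experiment Setup row maps directly onto a standard PyTorch optimiser/scheduler configuration. This sketch assumes the quoted step schedule is implemented with `MultiStepLR` (drops at epochs 150 and 250, i.e. 150 + 100 + 100 = 350 epochs total); the model and training loop body are placeholders, not the authors' code.

```python
import torch

model = torch.nn.Linear(10, 10)  # placeholder for e.g. a ResNet-50
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[150, 250], gamma=0.1)

for epoch in range(350):
    # ... one pass over the training set with mini-batches of size 128 ...
    scheduler.step()  # lr: 0.1 -> 0.01 at epoch 150, -> 0.001 at epoch 250
```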