Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
AdaFocal: Calibration-aware Adaptive Focal Loss
Authors: Arindam Ghosh, Thomas Schaaf, Matthew Gormley
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate AdaFocal on various image recognition and one NLP task, covering a wide variety of network architectures, to confirm the improvement in calibration while achieving similar levels of accuracy. Additionally, we show that models trained with AdaFocal achieve a significant boost in out-of-distribution detection. |
| Researcher Affiliation | Collaboration | Arindam Ghosh, 3M Health Info. Systems, Pittsburgh, PA 15217, EMAIL; Thomas Schaaf, 3M Health Info. Systems, Pittsburgh, PA 15217, EMAIL; Matt Gormley, Carnegie Mellon University, Pittsburgh, PA 15213, EMAIL |
| Pseudocode | Yes | Algorithm 1: AdaFocal |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] As part of the supplementary material and details are mentioned in Appendix D. |
| Open Datasets | Yes | We evaluate the performance of our proposed method on image and text classification tasks. For image classification, we use CIFAR-10, CIFAR-100 [9], Tiny-ImageNet [2], and ImageNet [27]... For text classification, we use the 20 Newsgroup dataset [14]. |
| Dataset Splits | Yes | We further assume access to a validation set for hyper-parameter tuning and a test set for evaluation. We experimented with AdaFocal using 5, 10, 15, 20, 30, and 50 equal-mass bins during training to draw calibration statistics from the validation set... Therefore, we use 15 bins for all AdaFocal trainings. |
| Hardware Specification | No | The main paper does not provide specific hardware details (e.g., GPU models, CPU types, or memory) used for the experiments. It states in the checklist that this information is in Appendix D, but Appendix D content is not provided within the given text. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, TensorFlow, specific libraries or compilers). It mentions general model types like CNN and BERT but not the software stack used to implement and run them. |
| Experiment Setup | Yes | If not stated explicitly, we use Sth = 0.2 for all AdaFocal experiments. λ is redundant and one may choose to ignore it as for all our experiments λ = 1 worked very well. For all our experiments, we use γmax = 20... γmin = 2 is selected... Therefore, we use 15 bins for all AdaFocal trainings. |
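For context on the hyperparameters quoted above (γmin = 2, γmax = 20, λ = 1): AdaFocal builds on the focal loss, whose focusing parameter γ is adapted using calibration statistics from the validation set. The sketch below is illustrative only, not the paper's Algorithm 1: `focal_loss` is the standard focal-loss term, and `update_gamma` is a simplified, hypothetical per-bin update that raises γ when a bin is over-confident and lowers it when under-confident, clamped to the reported [γmin, γmax] range.

```python
import math

def focal_loss(p_true, gamma):
    """Standard focal loss on the true-class probability p_true:
    FL(p) = -(1 - p)^gamma * log(p).  gamma = 0 recovers cross-entropy."""
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

def update_gamma(gamma, calibration_gap, lam=1.0, gamma_min=2.0, gamma_max=20.0):
    """Illustrative adaptive update (a simplification, not the paper's exact rule):
    calibration_gap is confidence minus accuracy for a validation bin, so
    gamma grows when the model is over-confident and shrinks when
    under-confident, clamped to [gamma_min, gamma_max] as in the reported setup."""
    new_gamma = gamma * math.exp(lam * calibration_gap)
    return min(max(new_gamma, gamma_min), gamma_max)
```

With γ > 0 the loss down-weights well-classified examples (large `p_true`), which is the mechanism the calibration-aware γ schedule modulates per bin.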