reproducibilityindex.ai

Fix your classifier: the marginal value of training the last weight layer

Authors: Elad Hoffer, Itay Hubara, Daniel Soudry

ICLR 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	3 EXPERIMENTAL RESULTS. Table 1: Validation accuracy results on learned vs. fixed classifier. We trained a residual network of He et al. (2016) on the Cifar10 dataset.
Researcher Affiliation	Academia	Elad Hoffer, Itay Hubara, Daniel Soudry. Department of Electrical Engineering, Technion, Haifa, 320003, Israel. {elad.hoffer, itay.hubara, daniel.soudry}@gmail.com
Pseudocode	No	The information is insufficient. The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code	Yes	Table 1 summarizes our fixed-classifier results on convolutional networks, comparing to originally reported results. We offer our drop-in replacement for learned classifier that can be used to train models with fixed classifiers and replicate our results (footnote 1). Footnote 1: Code is available at https://github.com/eladhoffer/fix_your_classifier
Open Datasets	Yes	We used the well known Cifar10 and Cifar100 datasets by Krizhevsky (2009) as an initial test-bed to explore the idea of a fixed classifier. In order to validate our results on a more challenging dataset, we used the Imagenet dataset introduced by Deng et al. (2009).
Dataset Splits	Yes	Cifar10 is an image classification benchmark dataset containing 50,000 training images and 10,000 test images. The results shown in figure 2 demonstrate that although the training error is considerably lower for the network with learned classifier, both models achieve the same classification accuracy on the validation set.
Hardware Specification	No	The information is insufficient. The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies	No	The information is insufficient. The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	We used a network of depth 56 and the same hyper-parameters used in the original work. We compared two variants: the original model with a learned classifier, and our version, where a fixed transformation is used. In all experiments the α scale parameter was regularized with the same weight decay coefficient used on original classifier.
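
The Experiment Setup row mentions a fixed transformation and a learned α scale parameter without showing how they fit together. The following is a minimal sketch of one possible reading, not the authors' implementation (their code is at the repository linked above). It rests on several assumptions: the fixed transformation is taken to be the first rows of a Hadamard matrix, the features are normalized to unit length before projection, and the weight decay on α is a plain L2 penalty. The function names `hadamard`, `fixed_classifier_logits`, and `alpha_weight_decay` are hypothetical.

```python
import math


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Block form [[H, H], [H, -H]] doubles the size at each step.
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def fixed_classifier_logits(features, num_classes, alpha):
    """Logits from a fixed (non-trained) projection, scaled by a scalar alpha.

    Assumption: the fixed weights are the first `num_classes` rows of a
    Hadamard matrix, and the feature vector is L2-normalized first.
    """
    dim = len(features)
    if num_classes > dim:
        raise ValueError("this sketch needs num_classes <= feature dimension")
    norm = math.sqrt(sum(v * v for v in features)) or 1.0  # avoid dividing by 0
    unit = [v / norm for v in features]
    rows = hadamard(dim)[:num_classes]
    return [alpha * sum(w * u for w, u in zip(row, unit)) for row in rows]


def alpha_weight_decay(alpha, coeff):
    """L2 penalty on alpha only; the fixed weights are never regularized."""
    return 0.5 * coeff * alpha * alpha


logits = fixed_classifier_logits([3.0, 0.0, 4.0, 0.0], num_classes=3, alpha=2.0)
penalty = alpha_weight_decay(2.0, coeff=1e-4)
```

In this sketch α is the only trainable quantity in the classifier, so `alpha_weight_decay` would be added to the training loss with the same coefficient that the learned classifier's weights would otherwise have received.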