Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Fix your classifier: the marginal value of training the last weight layer
Authors: Elad Hoffer, Itay Hubara, Daniel Soudry
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3, "Experimental Results". Table 1: "Validation accuracy results on learned vs. fixed classifier". "We trained a residual network of He et al. (2016) on the Cifar10 dataset." |
| Researcher Affiliation | Academia | Elad Hoffer, Itay Hubara, Daniel Soudry Department of Electrical Engineering Technion Haifa, 320003, Israel elad.hoffer, itay.hubara, EMAIL |
| Pseudocode | No | The information is insufficient. The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | Yes | Table 1 summarizes our fixed-classifier results on convolutional networks, comparing to originally reported results. We offer our drop-in replacement for learned classifier that can be used to train models with fixed classifiers and replicate our results. (Footnote 1: Code is available at https://github.com/eladhoffer/fix_your_classifier) |
| Open Datasets | Yes | We used the well known Cifar10 and Cifar100 datasets by Krizhevsky (2009) as an initial test-bed to explore the idea of a fixed classifier. In order to validate our results on a more challenging dataset, we used the Imagenet dataset introduced by Deng et al. (2009). |
| Dataset Splits | Yes | Cifar10 is an image classification benchmark dataset containing 50,000 training images and 10,000 test images. The results shown in figure 2 demonstrate that although the training error is considerably lower for the network with learned classifier, both models achieve the same classification accuracy on the validation set. |
| Hardware Specification | No | The information is insufficient. The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The information is insufficient. The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We used a network of depth 56 and the same hyper-parameters used in the original work. We compared two variants: the original model with a learned classifier, and our version, where a fixed transformation is used. In all experiments the α scale parameter was regularized with the same weight decay coefficient used on original classifier. |
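
The Experiment Setup row mentions a fixed transformation in place of the learned classifier, and an α scale parameter that receives weight decay. The sketch below illustrates one way such a setup could look; it is not the authors' code (see the repository linked above for that). The function names, the random unit-norm rows used for the fixed matrix, the normalization of the input features, and the `0.5 * coeff * alpha**2` form of the penalty are all assumptions made for illustration.

```python
import math
import random


def make_fixed_classifier(num_features, num_classes, seed=0):
    """Build a weight matrix that is created once and never trained.

    Assumption: rows are random Gaussian vectors scaled to unit norm.
    The paper's actual choice of fixed matrix may differ.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(num_classes):
        row = [rng.gauss(0.0, 1.0) for _ in range(num_features)]
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row])
    return rows


def fixed_classifier_logits(features, weights, alpha):
    """Compute logits as alpha * (W @ unit(features)).

    Only `alpha` would be trainable here; `weights` stays fixed.
    Assumption: the feature vector is normalized before projection.
    """
    norm = math.sqrt(sum(v * v for v in features)) or 1.0  # avoid dividing by zero
    unit = [v / norm for v in features]
    return [alpha * sum(w * x for w, x in zip(row, unit)) for row in weights]


def weight_decay_penalty(alpha, coeff):
    """L2 penalty on the scale parameter (hypothetical 0.5 * coeff * alpha^2 form)."""
    return 0.5 * coeff * alpha ** 2
```

Because both the weight rows and the normalized features have unit length in this sketch, every logit lies in the range [-α, α], which is why a trainable scale is needed for the softmax to produce confident predictions.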