Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
LEARNED STEP SIZE QUANTIZATION
Authors: Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here, we present a method for training such networks, Learned Step Size Quantization, that achieves the highest accuracy to date on the ImageNet dataset when using models, from a variety of architectures, with weights and activations quantized to 2-, 3- or 4-bits of precision, and that can train 3-bit models that reach full precision baseline accuracy. Table 1: Comparison of low precision networks on ImageNet. |
| Researcher Affiliation | Industry | Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha, IBM Research, San Jose, California, USA |
| Pseudocode | Yes | In this section we provide pseudocode to facilitate the implementation of LSQ. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | All experiments were conducted on the ImageNet dataset (Russakovsky et al., 2015) |
| Dataset Splits | Yes | Images were resized to 256 × 256, then a 224 × 224 crop was selected for training, with horizontal mirroring applied half the time. At test time, a 224 × 224 centered crop was chosen. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments (e.g., specific GPU/CPU models or processor details). |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for key software components or libraries. |
| Experiment Setup | Yes | Networks were trained with a momentum of 0.9, using a softmax cross entropy loss function, and cosine learning rate decay without restarts (Loshchilov & Hutter, 2016). ... The initial learning rate was set to 0.1 for full precision networks, 0.01 for 2-, 3-, and 4-bit networks and to 0.001 for 8-bit networks. |
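
The learning rate schedule quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' code. The function name `cosine_lr`, the `INITIAL_LR` lookup, the use of key 32 for full precision, decay to a final rate of zero, and per-step granularity are all assumptions made for illustration; only the initial rates and the choice of cosine decay without restarts come from the quoted text.

```python
import math

# Initial learning rates quoted in the Experiment Setup row, keyed by bit width.
# Using 32 as the key for "full precision" is a labeling assumption.
INITIAL_LR = {32: 0.1, 2: 0.01, 3: 0.01, 4: 0.01, 8: 0.001}


def cosine_lr(initial_lr, step, total_steps):
    """Cosine decay without restarts.

    Returns initial_lr at step 0 and decays to 0 at total_steps.
    The final value of 0 is an assumption; the quoted text does not state it.
    """
    # Clamp progress to [0, 1] so out-of-range steps do not restart the cycle.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * initial_lr * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total = 100  # hypothetical schedule length
    for bits in (32, 4, 8):
        lrs = [cosine_lr(INITIAL_LR[bits], s, total) for s in (0, 50, 100)]
        print(bits, lrs)
```

Because there are no restarts, the rate falls monotonically over the whole run, so a single call per step (or per epoch) is enough to drive an optimizer configured with momentum 0.9.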