LEARNED STEP SIZE QUANTIZATION

Authors: Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we present a method for training such networks, Learned Step Size Quantization, that achieves the highest accuracy to date on the ImageNet dataset when using models, from a variety of architectures, with weights and activations quantized to 2-, 3- or 4-bits of precision, and that can train 3-bit models that reach full precision baseline accuracy. Table 1: Comparison of low precision networks on ImageNet.
Researcher Affiliation | Industry | Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha, IBM Research, San Jose, California, USA
Pseudocode | Yes | In this section we provide pseudocode to facilitate the implementation of LSQ. (A hedged quantizer sketch is given after the table.)
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository.
Open Datasets | Yes | All experiments were conducted on the ImageNet dataset (Russakovsky et al., 2015)
Dataset Splits | Yes | Images were resized to 256 × 256, then a 224 × 224 crop was selected for training, with horizontal mirroring applied half the time. At test time, a 224 × 224 centered crop was chosen. (See the preprocessing sketch after the table.)
Hardware Specification | No | The paper does not explicitly describe the specific hardware used to run its experiments (e.g., specific GPU/CPU models or processor details).
Software Dependencies | No | The paper mentions 'PyTorch' but does not provide specific version numbers for key software components or libraries.
Experiment Setup | Yes | Networks were trained with a momentum of 0.9, using a softmax cross entropy loss function, and cosine learning rate decay without restarts (Loshchilov & Hutter, 2016). ... The initial learning rate was set to 0.1 for full precision networks, 0.01 for 2-, 3-, and 4-bit networks and to 0.001 for 8-bit networks. (See the training-setup sketch after the table.)
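
The Pseudocode row above quotes the paper's statement that it provides pseudocode for LSQ. As a rough illustration of how that quantizer can be implemented, the PyTorch sketch below follows the paper's description: quantize with a learned step size s, use a straight-through estimator for rounding, and scale the step-size gradient by 1/sqrt(N * Q_P). The helper and class names (grad_scale, round_pass, LsqQuantizer) and the initialization method are illustrative choices, not code released by the authors.

import math
import torch
import torch.nn as nn

def grad_scale(x, scale):
    # Forward pass: identity. Backward pass: gradient multiplied by `scale`.
    return (x - x * scale).detach() + x * scale

def round_pass(x):
    # Round to the nearest integer, with a straight-through gradient estimate.
    return (x.round() - x).detach() + x

class LsqQuantizer(nn.Module):
    def __init__(self, bits, is_activation=False):
        super().__init__()
        if is_activation:  # unsigned data (e.g., post-ReLU activations)
            self.q_n, self.q_p = 0, 2 ** bits - 1
        else:              # signed data (weights)
            self.q_n, self.q_p = 2 ** (bits - 1), 2 ** (bits - 1) - 1
        self.step = nn.Parameter(torch.tensor(1.0))  # learned step size s

    def init_step(self, x):
        # Step-size initialization described in the paper: 2 * mean(|x|) / sqrt(Q_P)
        self.step.data.copy_(2 * x.abs().mean() / math.sqrt(self.q_p))

    def forward(self, x):
        # Step-size gradient scale g = 1 / sqrt(N * Q_P), N = number of elements quantized
        g = 1.0 / math.sqrt(x.numel() * self.q_p)
        s = grad_scale(self.step, g)
        x = torch.clamp(x / s, -self.q_n, self.q_p)
        x = round_pass(x)
        return x * s

In use, a quantized layer would pass its weight tensor (and its input activations) through such a module before the convolution or matrix multiply.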
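
The Dataset Splits row quotes the ImageNet preprocessing. A minimal torchvision sketch of that pipeline might look as follows; the normalization step is omitted because the quote does not specify the statistics, and torchvision itself is an assumption (the paper names only PyTorch).

from torchvision import transforms

# Training-time preprocessing: resize to 256 x 256, take a random 224 x 224 crop,
# and mirror horizontally half the time.
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])

# Test-time preprocessing: resize to 256 x 256, then take a centered 224 x 224 crop.
eval_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])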
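
Finally, the Experiment Setup row quotes the optimizer and learning rate schedule. A sketch of that configuration in PyTorch, for a 2-, 3-, or 4-bit network, is below; the model, epoch budget, and weight decay are placeholders not taken from the quote above.

import torch.nn as nn
import torch.optim as optim
from torchvision.models import resnet18

model = resnet18()                 # stand-in for a network with LSQ quantizers inserted
criterion = nn.CrossEntropyLoss()  # softmax cross entropy loss
optimizer = optim.SGD(model.parameters(),
                      lr=0.01,           # initial learning rate quoted for 2-, 3-, and 4-bit networks
                      momentum=0.9,
                      weight_decay=1e-4)  # placeholder; the quote does not give a weight decay
num_epochs = 90                           # placeholder epoch budget
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)  # cosine decay, no restarts

for epoch in range(num_epochs):
    # ... one training pass over the data would go here ...
    scheduler.step()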