And the Bit Goes Down: Revisiting the Quantization of Neural Networks

Authors: Pierre Stock, Armand Joulin, Rémi Gribonval, Benjamin Graham, Hervé Jégou

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our approach by quantizing a high-performing ResNet-50 model to a memory size of 5 MB (20× compression factor) while preserving a top-1 accuracy of 76.1% on ImageNet object classification and by compressing a Mask R-CNN with a 26× factor. (A back-of-the-envelope check of the 20× ratio follows the table.)
Researcher Affiliation | Collaboration | Facebook AI Research; Univ Rennes, Inria, CNRS, IRISA
Pseudocode | No | The paper describes the steps of its algorithm (E-step, M-step) in paragraph form and as bullet points but does not present them in a structured pseudocode or algorithm block. (A generic EM-style codebook sketch is given after the table.)
Open Source Code | Yes | Code and compressed models: https://github.com/facebookresearch/kill-the-bits.
Open Datasets | Yes | We quantize vanilla ResNet-18 and ResNet-50 architectures pretrained on the ImageNet dataset (Deng et al., 2009). Unless explicit mention of the contrary, the pretrained models are taken from the PyTorch model zoo. ... In particular, Yalniz et al. (2019) use the publicly available YFCC-100M dataset (Thomee et al., 2015) to train a ResNet-50 that reaches 79.1% top-1 accuracy on the standard validation set of ImageNet. (A snippet for loading these model-zoo checkpoints follows the table.)
Dataset Splits | Yes | We quantize vanilla ResNet-18 and ResNet-50 architectures pretrained on the ImageNet dataset (Deng et al., 2009). ... The accuracy is the top-1 error on the standard validation set of ImageNet. ... We perform the global finetuning using the standard ImageNet training set for 9 epochs with an initial learning rate of 0.01, a weight decay of 10⁻⁴ and a momentum of 0.9. The learning rate is decayed by a factor 10 every 3 epochs.
Hardware Specification | Yes | We run our method on a 16 GB Volta V100 GPU. Quantizing a ResNet-50 with our method (including all finetuning steps) takes about one day on 1 GPU. ... We perform the fine-tuning (layer-wise and global) using distributed training on 8 V100 GPUs.
Software Dependencies | No | Unless explicit mention of the contrary, the pretrained models are taken from the PyTorch model zoo. The paper mentions PyTorch but does not specify a version number or other software dependencies with version numbers.
Experiment Setup | Yes | We quantize each layer while performing 100 steps of our method (sufficient for convergence in practice). We finetune the centroids of each layer on the standard ImageNet training set during 2,500 iterations with a batch size of 128 (resp. 64) for the ResNet-18 (resp. ResNet-50) with a learning rate of 0.01, a weight decay of 10⁻⁴ and a momentum of 0.9. For accuracy and memory reasons, the classifier is always quantized with a block size d = 4 and k = 2048 (resp. k = 1024) centroids for the ResNet-18 (resp. ResNet-50). (These hyperparameters are written out as a PyTorch optimizer/scheduler sketch after the table.)
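
As a back-of-the-envelope check on the 20× figure in the Research Type row: a standard ResNet-50 has roughly 25.6 million parameters (the usual torchvision count, an assumption not stated in the quote above), so the uncompressed float32 model is on the order of 100 MB, and a 5 MB budget corresponds to about a 20× reduction.

```python
# Rough compression-ratio check for the numbers quoted above.
n_params = 25_600_000                    # approximate ResNet-50 parameter count (assumption)
fp32_mb = n_params * 4 / (1024 ** 2)     # 4 bytes per float32 weight -> ~97.7 MB
compressed_mb = 5.0                      # memory budget reported in the paper
print(f"uncompressed ~ {fp32_mb:.1f} MB, compression ratio ~ {fp32_mb / compressed_mb:.1f}x")
```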
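
Since the Pseudocode row notes that the E-step/M-step are only described in prose, here is a generic sketch of EM-style codebook learning over weight subvectors. It is plain k-means on blocks of the weight matrix; the paper's actual objective additionally weights the reconstruction error by input activations, which is not reproduced here. The function and variable names (`quantize_weight_blocks`, `W`, `d`, `k`) are illustrative, not taken from the released code.

```python
import torch

def quantize_weight_blocks(W, k=256, d=4, n_iter=100):
    """Toy EM-style codebook learning over weight blocks.

    Plain k-means on d-dimensional subvectors of W; the paper's method also
    weights the reconstruction error by input activations, omitted here.
    Assumes W.numel() is divisible by d.
    """
    blocks = W.detach().reshape(-1, d)              # (n_blocks, d) subvectors
    # Initialise the codebook from k randomly chosen blocks.
    init = torch.randperm(blocks.size(0))[:k]
    centroids = blocks[init].clone()                # (k, d) codebook
    assign = torch.zeros(blocks.size(0), dtype=torch.long)
    for _ in range(n_iter):
        # E-step: assign every block to its nearest centroid.
        assign = torch.cdist(blocks, centroids).argmin(dim=1)
        # M-step: move each centroid to the mean of its assigned blocks.
        for c in range(k):
            members = blocks[assign == c]
            if members.numel() > 0:
                centroids[c] = members.mean(dim=0)
    # Reconstruct the quantized weight matrix from codebook + assignments.
    W_hat = centroids[assign].reshape(W.shape)
    return W_hat, centroids, assign

# Example: quantize a random 2-D weight matrix into k=256 codewords of size d=4.
W = torch.randn(512, 256)
W_hat, codebook, assignments = quantize_weight_blocks(W, k=256, d=4, n_iter=10)
```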
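
For the pretrained checkpoints referenced in the Open Datasets and Software Dependencies rows, the PyTorch model zoo baselines can be loaded through torchvision as below. The `pretrained=True` keyword reflects the torchvision API of the paper's time frame; the exact version the authors used is not specified.

```python
import torchvision.models as models

# ImageNet-pretrained baselines from the PyTorch model zoo, as quoted above.
resnet18 = models.resnet18(pretrained=True)
resnet50 = models.resnet50(pretrained=True)
resnet50.eval()  # evaluation mode when measuring top-1 accuracy on the validation set
```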
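
The finetuning hyperparameters quoted in the Dataset Splits and Experiment Setup rows map directly onto a standard SGD optimizer with a step schedule. A minimal sketch, assuming `centroid_params` stands in for the codebook tensors being finetuned (a hypothetical placeholder, not the released training code):

```python
import torch

# Placeholder for the centroids of the quantized layers (hypothetical shapes).
centroid_params = [torch.nn.Parameter(torch.randn(2048, 4))]

# SGD with the quoted hyperparameters: lr 0.01, momentum 0.9, weight decay 1e-4.
optimizer = torch.optim.SGD(centroid_params, lr=0.01, momentum=0.9, weight_decay=1e-4)

# Global finetuning: 9 epochs, learning rate decayed by a factor of 10 every 3 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

for epoch in range(9):
    # ... one pass over the standard ImageNet training set would go here ...
    scheduler.step()
```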