Post training 4-bit quantization of convolutional networks for rapid-deployment

Authors: Ron Banner, Yury Nahshan, Daniel Soudry

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Combining these methods, our approach achieves accuracy that is just a few percent less than the state-of-the-art baseline across a wide range of convolutional models. The source code to replicate all experiments is available on GitHub: https://github.com/submission2019/cnn-quantization. This section reports experiments on post-training quantization using six convolutional models originally pre-trained on the ImageNet dataset.
Researcher Affiliation | Collaboration | Intel Artificial Intelligence Products Group (AIPG); Technion - Israel Institute of Technology
Pseudocode | No | The paper contains mathematical formulations and derivations, but no explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The source code to replicate all experiments is available on GitHub: https://github.com/submission2019/cnn-quantization.
Open Datasets | Yes | This section reports experiments on post-training quantization using six convolutional models originally pre-trained on the ImageNet dataset.
Dataset Splits | Yes | Table 1: ImageNet Top-1 validation accuracy with post-training quantization using the three methods suggested by this work.
Hardware Specification | No | The paper does not specify the hardware used for its experiments.
Software Dependencies | No | The paper mentions GEMMLOWP but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | This section reports experiments on post-training quantization using six convolutional models originally pre-trained on the ImageNet dataset. We consider the following baseline setup: per-channel quantization of weights and activations: ... Fused ReLU: ... We use the common practice of quantizing the first and the last layer, as well as average/max-pooling layers, to 8-bit precision.
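To make the Experiment Setup row above concrete, here is a minimal sketch of per-channel uniform quantization with 4-bit precision for intermediate convolutional layers and 8-bit precision for the first and last layers. The function and variable names, and the symmetric round-to-nearest quantizer itself, are illustrative assumptions rather than the authors' method; their released implementation is the linked GitHub repository (https://github.com/submission2019/cnn-quantization).

```python
# Illustrative sketch only: a symmetric, round-to-nearest per-channel quantizer
# applied to conv weights. Names and details are assumptions, not the paper's code.
import torch

def quantize_per_channel(weight: torch.Tensor, num_bits: int = 4) -> torch.Tensor:
    """Fake-quantize a conv weight of shape (out_channels, ...) per output channel."""
    qmax = 2 ** (num_bits - 1) - 1                       # e.g. 7 for signed 4-bit
    w = weight.reshape(weight.shape[0], -1)              # one row per output channel
    scale = w.abs().max(dim=1, keepdim=True).values / qmax
    scale = scale.clamp(min=1e-8)                        # avoid division by zero
    w_q = torch.round(w / scale).clamp(-qmax - 1, qmax)  # snap to the integer grid
    return (w_q * scale).reshape(weight.shape)           # dequantize back to float

def quantize_model(model: torch.nn.Module) -> None:
    """Baseline setup from the table: 4-bit intermediate layers, 8-bit first/last.

    Simplification: only Conv2d weights are touched here; the 8-bit handling of
    the final classifier and of average/max-pooling layers is omitted.
    """
    convs = [m for m in model.modules() if isinstance(m, torch.nn.Conv2d)]
    for i, conv in enumerate(convs):
        bits = 8 if i in (0, len(convs) - 1) else 4
        conv.weight.data = quantize_per_channel(conv.weight.data, bits)
```

In the paper's baseline, activations are quantized per channel as well, with ReLU fused into the quantization step so the range is non-negative; that part is left out of this sketch.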