Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

Authors: Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Chris De Sa, Zhiru Zhang

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental evaluation on ImageNet classification and language modeling shows that OCS can outperform state-of-the-art clipping techniques with only minor overhead.
Researcher Affiliation | Academia | Ritchie Zhao 1, Yuwei Hu 1, Jordan Dotzel 1, Christopher De Sa 1, Zhiru Zhang 1; 1 Cornell University, Ithaca, New York 14850, USA.
Pseudocode | No | The paper describes the method using mathematical equations and diagrams, but does not provide pseudocode or a formally labeled algorithm block.
Open Source Code | Yes | Code for both OCS and clipping is available in open source: https://github.com/cornell-zhang/dnn-quant-ocs
Open Datasets | Yes | Experimental evaluation on ImageNet classification and language modeling shows that OCS can outperform state-of-the-art clipping techniques with only minor overhead. ... Deep convolutional networks dominate the leaderboards for popular image classification and object detection datasets such as ImageNet (Deng et al., 2009) and Microsoft COCO (Lin et al., 2014). ... The corpus is the WikiText-2 dataset (Merity et al., 2016) with a vocabulary of 33,278 words.
Dataset Splits | Yes | Table 2. ImageNet Top-1 validation accuracy with weight quantization ... For activation quantization, we first sampled the activation distributions using 512 training images (i.e. images not part of the validation/test set) to determine the quantization grid points, then use this grid during testing.
Hardware Specification | Yes | This profiling took between 40 and 200 seconds on our machine using an NVIDIA GTX 1080 Ti. ... One of the Titan Xp GPUs used for this research was donated by NVIDIA.
Software Dependencies | No | This section reports experiments on CNN models for ImageNet classification (Deng et al., 2009) conducted using PyTorch (Paszke et al., 2017) and Intel's open-source Distiller quantization library.
Experiment Setup | Yes | For a layer containing C channels, OCS splits ceil(r·C) channels, where r is the expansion ratio, a hyperparameter that determines approximately the level of tolerable overhead in the network. ... The first layer was not quantized as it generally requires more bits than the others, and contains only 3 input channels, meaning OCS would incur a large overhead.
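The core operation described above, splitting ceil(r·C) outlier channels so the layer computes the same function with a smaller weight dynamic range, can be sketched in a few lines of NumPy. This is a minimal illustration for a 2-D weight matrix, not the repository's implementation: the function name `ocs_split_weights` is ours, and convolutional layers, activation splitting, and quantization itself are omitted.

```python
import math

import numpy as np


def ocs_split_weights(W, expansion_ratio):
    """Minimal Outlier Channel Splitting sketch (assumed 2-D weights).

    W has shape [out_features, in_features]. Each split duplicates the
    input channel holding the current largest-magnitude weight and
    halves both copies, which preserves the layer's output
    (W[:, c] * x_c == (W[:, c] / 2) * x_c + (W[:, c] / 2) * x_c)
    while shrinking the range a quantizer must cover.

    Returns the widened matrix and the indices of split channels, which
    are needed to duplicate the matching activation channels.
    """
    W = W.copy()
    num_splits = math.ceil(expansion_ratio * W.shape[1])
    split_ids = []
    for _ in range(num_splits):
        # Input channel containing the current largest-magnitude weight.
        c = int(np.argmax(np.abs(W).max(axis=0)))
        half = W[:, c : c + 1] / 2.0
        W[:, c : c + 1] = half
        W = np.concatenate([W, half], axis=1)  # widen by one channel
        split_ids.append(c)
    return W, split_ids
```

With r = 0.5 on a 2-channel layer, one channel is split: the matrix gains a column, the largest weight magnitude halves, and duplicating the corresponding activation channel leaves the layer's output unchanged. This illustrates the overhead trade-off the authors describe: each split adds one channel, so r bounds the relative growth of the layer.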