Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Authors: Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Chris De Sa, Zhiru Zhang
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluation on ImageNet classification and language modeling shows that OCS can outperform state-of-the-art clipping techniques with only minor overhead. |
| Researcher Affiliation | Academia | Ritchie Zhao¹, Yuwei Hu¹, Jordan Dotzel¹, Christopher De Sa¹, Zhiru Zhang¹. ¹Cornell University, Ithaca, New York 14850, USA. |
| Pseudocode | No | The paper describes the method using mathematical equations and diagrams, but does not provide pseudocode or a formally labeled algorithm block. |
| Open Source Code | Yes | Code for both OCS and clipping is available in open source: https://github.com/cornell-zhang/dnn-quant-ocs |
| Open Datasets | Yes | Experimental evaluation on ImageNet classification and language modeling shows that OCS can outperform state-of-the-art clipping techniques with only minor overhead. ... Deep convolutional networks dominate the leaderboards for popular image classification and object detection datasets such as ImageNet (Deng et al., 2009) and Microsoft COCO (Lin et al., 2014). ... The corpus is the WikiText-2 dataset (Merity et al., 2016) with a vocabulary of 33,278 words. |
| Dataset Splits | Yes | Table 2. ImageNet Top-1 validation accuracy with weight quantization... For activation quantization, we first sampled the activation distributions using 512 training images (i.e. images not part of the validation/test set) to determine the quantization grid points, then use this grid during testing. |
| Hardware Specification | Yes | This profiling took between 40 and 200 seconds on our machine using an NVIDIA GTX 1080 Ti. ... One of the Titan Xp GPUs used for this research was donated by NVIDIA. |
| Software Dependencies | No | This section reports experiments on CNN models for ImageNet classification (Deng et al., 2009) conducted using PyTorch (Paszke et al., 2017) and Intel's open-source Distiller quantization library. |
| Experiment Setup | Yes | For a layer containing C channels, OCS splits ceil(rC) channels, where r is the expansion ratio, a hyperparameter that determines approximately the level of tolerable overhead in the network. ... The first layer was not quantized as it generally requires more bits than the others, and contains only 3 input channels, meaning OCS would incur a large overhead. |
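
The channel count in the Experiment Setup row can be illustrated with a small sketch. It assumes that a split duplicates the input channel holding the largest-magnitude weight and halves both copies, so the layer output is unchanged. The function names (`num_splits`, `split_outlier_channels`, `apply_layer`) and the list-of-lists weight layout are hypothetical and are not taken from the released code.

```python
import math


def num_splits(num_channels, expansion_ratio):
    """Number of channels to split: ceil(r * C)."""
    return math.ceil(expansion_ratio * num_channels)


def split_outlier_channels(weights, expansion_ratio):
    """Split input channels of a dense layer (illustrative sketch only).

    weights: list of rows, one row per output channel, one entry per
             input channel (assumed layout).
    Returns the widened weights and, for each column, the index of the
    original input channel that feeds it.
    """
    w = [list(row) for row in weights]  # copy so the caller's data is untouched
    num_in = len(w[0])
    source = list(range(num_in))
    for _ in range(num_splits(num_in, expansion_ratio)):
        # Pick the column that currently holds the largest-magnitude weight.
        col = max(range(len(w[0])), key=lambda j: max(abs(row[j]) for row in w))
        for row in w:
            row[col] /= 2.0         # halve the original channel
            row.append(row[col])    # append the duplicate with the same halved value
        source.append(source[col])  # the duplicate reads the same input channel
    return w, source


def apply_layer(weights, x):
    """Plain matrix-vector product y = W x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in weights]


weights = [[1.0, -8.0, 0.5, 2.0],
           [0.25, 3.0, -1.0, 0.5]]
x = [1.0, 2.0, 3.0, 4.0]

split_w, source = split_outlier_channels(weights, expansion_ratio=0.25)
x_expanded = [x[i] for i in source]  # duplicate the matching input channels
```

With `expansion_ratio=0.25` and 4 input channels, one channel is split: the outlier weight -8.0 becomes two weights of -4.0, and `apply_layer(split_w, x_expanded)` matches `apply_layer(weights, x)`. For a 3-channel layer, `num_splits(3, 0.05)` is already 1, which widens the layer by a third and is consistent with the quoted remark that OCS would incur a large overhead on the first layer.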