Focused Quantization for Sparse CNNs
Authors: Yiren Zhao, Xitong Gao, Daniel Bates, Robert Mullins, Cheng-Zhong Xu
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In ResNet-50, we achieved an 18.08× CR with only 0.24% loss in top-5 accuracy, outperforming existing compression methods. We fully compressed a ResNet-18 and found that it is not only higher in CR and top-5 accuracy, but also more hardware efficient as it requires fewer logic gates to implement when compared to other state-of-the-art quantization methods assuming the same throughput. |
| Researcher Affiliation | Collaboration | (1) University of Cambridge, (2) Shenzhen Institutes of Advanced Technology, (3) University of Macau |
| Pseudocode | No | The paper includes diagrams (e.g., Figure 2, Figure 4) illustrating the process or architecture, but it does not contain formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Finally, FQ and the optimized models are fully open-source and released to the public. Available at: https://github.com/deep-fry/mayo. |
| Open Datasets | Yes | We applied focused compression (FC)... on the ImageNet dataset [2]... This model is a fast CIFAR-10 [13] classifier... |
| Dataset Splits | Yes | We applied focused compression (FC)... on the ImageNet dataset [2]... This model is a fast CIFAR-10 [13] classifier... At each step, the models were fine-tuned for 3 epochs at a learning rate of 0.001, except for the final step at 100%, which ran for 10 epochs with the learning rate decayed every 3 epochs. |
| Hardware Specification | No | The paper discusses 'custom hardware accelerators' and provides 'Computation resource estimates of custom accelerators' in terms of logic gates (Table 4), but it does not specify the commercial hardware (e.g., specific GPUs, CPUs) used to run the reported experiments. |
| Software Dependencies | No | The paper mentions techniques and tools used (e.g., 'Dynamic Network Surgery', 'Incremental Network Quantization', 'Huffman encoding') but does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | During fine-tuning, we additionally employed Incremental Network Quantization (INQ) [26] and gradually increased the proportion of weights being quantized to 25%, 50%, 75%, 87.5% and 100%. At each step, the models were fine-tuned for 3 epochs at a learning rate of 0.001, except for the final step at 100%, which ran for 10 epochs with the learning rate decayed every 3 epochs. A minimal sketch of this schedule appears below the table. |
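
The following is a minimal, hypothetical PyTorch sketch of the INQ-style fine-tuning schedule quoted in the Experiment Setup row: the fraction of quantized weights grows over steps (25%, 50%, 75%, 87.5%, 100%), each step fine-tunes the remaining floating-point weights for 3 epochs at a learning rate of 0.001, and the final step runs 10 epochs with the learning rate decayed every 3 epochs. The names `model`, `train_one_epoch`, and the `quantize` placeholder, as well as the decay factor and momentum value, are assumptions for illustration and are not taken from the authors' Mayo implementation linked above.

```python
import torch

STEPS = [0.25, 0.50, 0.75, 0.875, 1.0]  # fraction of weights quantized at each step

def quantize(w):
    """Placeholder quantizer: round non-zero weights to the nearest power of two.
    FQ's actual quantizer differs; this stands in only to make the schedule runnable."""
    sign = torch.sign(w)
    mag = torch.where(
        w == 0,
        torch.zeros_like(w),
        2.0 ** torch.round(torch.log2(w.abs().clamp(min=1e-12))),
    )
    return sign * mag

def inq_finetune(model, train_one_epoch):
    masks = {}  # per-parameter mask of weights already frozen at quantized values
    for fraction in STEPS:
        # 1) Freeze the largest-magnitude `fraction` of each weight tensor at quantized values.
        for name, p in model.named_parameters():
            if p.dim() < 2:  # skip biases / BN parameters
                continue
            k = int(fraction * p.numel())
            if k > 0:
                threshold = p.detach().abs().flatten().kthvalue(p.numel() - k + 1).values
            else:
                threshold = float("inf")
            mask = p.detach().abs() >= threshold
            with torch.no_grad():
                p[mask] = quantize(p.detach()[mask])
            masks[name] = mask

        # 2) Fine-tune the remaining floating-point weights.
        epochs = 10 if fraction == 1.0 else 3
        optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)  # momentum assumed
        for epoch in range(epochs):
            if fraction == 1.0 and epoch > 0 and epoch % 3 == 0:
                for g in optimizer.param_groups:
                    g["lr"] *= 0.1  # decay factor assumed; the paper only says "decay"
            train_one_epoch(model, optimizer)
            # Re-apply frozen quantized values so gradient updates do not disturb them.
            with torch.no_grad():
                for name, p in model.named_parameters():
                    if name in masks:
                        p[masks[name]] = quantize(p[masks[name]])
    return model
```

`train_one_epoch(model, optimizer)` is assumed to run one pass over the training set; re-applying the quantized values after every epoch is a simple way to keep the frozen partition fixed, whereas an implementation closer to INQ would mask the gradients of the frozen weights instead.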