OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization

Authors: Peng Hu, Xi Peng, Hongyuan Zhu, Mohamed M. Sabry Aly, Jie Lin

AAAI 2021, pp. 7780-7788

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments on ImageNet with AlexNet/MobileNet-V1/ResNet-50 show that our method improves accuracy and training efficiency while obtaining significantly higher compression rates compared to the state-of-the-art.
Researcher Affiliation | Academia | (1) Institute for Infocomm Research, Agency for Science, Technology and Research, Singapore; (2) College of Computer Science, Sichuan University, Chengdu 610065, China; (3) Nanyang Technological University, Singapore
Pseudocode | Yes | Algorithm 1: Optimization process of our method (a PyTorch sketch of this loop follows the table).
Input: A pre-trained FP32 model with L layers, objective pruning rate p, objective quantization bitwidth B, batch size N_b, and maximum epoch number N_e.
Output: Finetuned compressed model.
1: Compute the pruning masks {M_i}_{i=1}^{L} for all layers (see Section 3.2).
2: Calculate the quantization steps {Δ_i}_{i=1}^{L} for all layers (see Section 3.3).
3: for e = 1, 2, ..., N_e do
4:   repeat
5:     Randomly sample a minibatch from the training set.
6:     Compress the model weights using {Δ_i}_{i=1}^{L} and {M_i}_{i=1}^{L}.
7:     Forward propagate with the pruned and quantized weights, and compute the cross-entropy loss.
8:     Update the model weights by descending their stochastic gradient.
9:   until all samples selected
10: end for
Open Source Code | No | The paper does not provide a direct link to open-source code or explicitly state that the code will be made available.
Open Datasets | Yes | All experiments are performed on ImageNet (i.e., ILSVRC-2012) (Deng et al. 2009), a large-scale image classification dataset consisting of 1.2M training images and 50K validation images.
Dataset Splits | Yes | All experiments are performed on ImageNet (i.e., ILSVRC-2012) (Deng et al. 2009), a large-scale image classification dataset consisting of 1.2M training images and 50K validation images.
Hardware Specification | No | The paper does not specify the hardware used for experiments.
Software Dependencies | No | The proposed method is implemented in PyTorch. However, no specific version number for PyTorch or other software dependencies is provided.
Experiment Setup | Yes | We set the batch size to 256 for all models at the finetuning stage. The SGD optimizer is utilized to finetune the compressed models with momentum (= 0.9), weight decay (= 10^-4), and learning rate (= 0.005). (A hypothetical PyTorch instantiation of this setup follows the table.)
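
For concreteness, the following is a minimal PyTorch sketch of the finetuning loop quoted above as Algorithm 1. It is a sketch under assumptions, not the authors' implementation: compute_masks and compute_steps are simple magnitude-pruning and uniform-quantization placeholders rather than the analytical one-shot solutions the paper derives in its Sections 3.2 and 3.3, the full-precision-weight bookkeeping is one common straight-through choice that the quoted pseudocode does not spell out, and all function names are illustrative.

import torch

def compute_masks(model, prune_rate):
    # Placeholder: per-layer magnitude pruning at the target rate (the paper
    # instead derives the masks analytically in one shot, its Section 3.2).
    masks = {}
    for name, param in model.named_parameters():
        if param.dim() > 1:  # prune weight tensors only, skip biases
            k = max(int(prune_rate * param.numel()), 1)
            threshold = param.detach().abs().flatten().kthvalue(k).values
            masks[name] = (param.detach().abs() > threshold).float()
    return masks

def compute_steps(model, bitwidth):
    # Placeholder: uniform symmetric quantization step per layer (the paper
    # instead computes the steps analytically in one shot, its Section 3.3).
    steps = {}
    for name, param in model.named_parameters():
        if param.dim() > 1:
            steps[name] = param.detach().abs().max() / (2 ** (bitwidth - 1) - 1)
    return steps

def compress_weights(model, masks, steps):
    # Algorithm 1, step 6: prune and quantize the weights in place.
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                quantized = torch.round(param / steps[name]) * steps[name]
                param.copy_(quantized * masks[name])

def finetune(model, loader, optimizer, num_epochs, prune_rate, bitwidth):
    masks = compute_masks(model, prune_rate)  # Algorithm 1, step 1
    steps = compute_steps(model, bitwidth)    # Algorithm 1, step 2
    criterion = torch.nn.CrossEntropyLoss()
    for _ in range(num_epochs):               # Algorithm 1, steps 3-10
        for inputs, targets in loader:        # step 5: sample a minibatch
            # Keep the full-precision weights so the gradient update is applied
            # to them (straight-through style) rather than to the compressed copy.
            full_precision = {n: p.detach().clone()
                              for n, p in model.named_parameters() if n in masks}
            compress_weights(model, masks, steps)     # step 6
            loss = criterion(model(inputs), targets)  # step 7: cross-entropy loss
            optimizer.zero_grad()
            loss.backward()
            with torch.no_grad():
                for n, p in model.named_parameters():
                    if n in masks:
                        p.copy_(full_precision[n])
            optimizer.step()                          # step 8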
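
The reported finetuning setup could be instantiated as below. This is again a hypothetical sketch: the torchvision ResNet-50 constructor stands in for any of the three evaluated architectures, and imagenet_train is a placeholder for an ImageNet training dataset object that the paper does not define in code.

import torch
import torchvision

model = torchvision.models.resnet50()  # ResNet-50, one of the three evaluated architectures
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.005,           # learning rate reported in the paper
    momentum=0.9,       # momentum reported in the paper
    weight_decay=1e-4,  # weight decay reported in the paper
)
# Batch size 256 is used for all models at the finetuning stage.
# loader = torch.utils.data.DataLoader(imagenet_train, batch_size=256, shuffle=True)
# finetune(model, loader, optimizer, num_epochs=..., prune_rate=..., bitwidth=...)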