AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters

Authors: Xia Xiao, Zigeng Wang, Sanguthevar Rajasekaran

NeurIPS 2019

Reproducibility Variable Result LLM Response
Research Type Experimental We evaluate our method with LeNet and VGG-like on MNIST and CIFAR-10 datasets, and with AlexNet, ResNet and MobileNet on ImageNet to establish the scalability of our work. Results show that our model achieves state-of-the-art sparsity, e.g. 7%, 23% FLOPs and 310x, 75x compression ratio for LeNet-5 and VGG-like structure without accuracy drop, and 200M and 100M FLOPs for MobileNet V2 with accuracy 73.32% and 66.83% respectively.
Researcher Affiliation Academia Xia Xiao, Zigeng Wang, Sanguthevar Rajasekaran Department of Computer Science and Engineering University of Connecticut Storrs, CT, USA, 06269 {xia.xiao, zigeng.wang, sanguthevar.rajasekaran}@uconn.edu
Pseudocode Yes Algorithm 1 AutoPrune (a hedged sketch of the gating-and-regularization idea appears after the table)
Open Source Code No The paper does not provide concrete access to source code through a specific repository link, explicit code release statement, or code in supplementary materials.
Open Datasets Yes We evaluate our method with LeNet and VGG-like on MNIST and CIFAR-10 datasets, and with AlexNet, ResNet and MobileNet on ImageNet to establish the scalability of our work.
Dataset Splits Yes The training set will be split into Xtrain and Xval... We split the training data into 1:1 for weight update and auxiliary parameter update respectively. (See the split-and-alternate sketch after the table.)
Hardware Specification Yes Our models are implemented by Tensorflow and run on Ubuntu Linux 16.04 with 32G memory and a single NVIDIA Titan Xp GPU.
Software Dependencies No The paper mentions "Tensorflow" but does not provide specific version numbers for any software dependencies.
Experiment Setup Yes To show the insensitivity of the introduced hyperparameter, we set the learning rate of auxiliary parameters to 1.5e-2 and µ to 5e-2 for all test cases. In this structure, we use L2-norm and L1-norm for L1 with hyperparameters 5e-5 and 1e-6, respectively. ResNet-50 is trained with a learning rate schedule from 1e-5 to 1e-6. The learning rate for AlexNet is 1e-3 and for MobileNet V2 is 1e-5. (The quoted hyperparameters are collected into a sketch config after the table.)
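
The paper's pseudocode (Algorithm 1, AutoPrune) is not reproduced here. Below is a minimal sketch of the core idea as this report reads it: each weight carries an auxiliary parameter, the weight is gated by a hard threshold on that auxiliary, and a sparsity regularizer acts on the auxiliaries rather than on the weights. The sketch uses PyTorch rather than the paper's TensorFlow, and the names `MaskedLinear`, `aux`, and `aux_regularizer` are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch in the spirit of Algorithm 1 (AutoPrune), not the
# authors' implementation.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Linear layer whose weights are gated by per-weight auxiliary parameters."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        # One auxiliary parameter per weight; a weight is kept while its
        # auxiliary is positive and pruned once it is driven negative.
        self.aux = nn.Parameter(torch.ones(out_features, in_features))

    def forward(self, x):
        # Hard 0/1 gate with a straight-through-style surrogate so gradients
        # can still reach the auxiliary parameters.
        hard = (self.aux > 0).float()
        soft = self.aux.sigmoid()
        gate = hard + soft - soft.detach()   # value = hard, gradient = d(soft)
        return x @ (self.weight * gate).t()

def aux_regularizer(layer, mu=5e-2):
    # Sparsity-inducing penalty on the auxiliary parameters (assumed L1 form;
    # mu = 5e-2 follows the value quoted in the Experiment Setup row).
    return mu * layer.aux.abs().sum()
```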
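
The 1:1 split between Xtrain (weight updates) and Xval (auxiliary-parameter updates) quoted in the Dataset Splits row can be illustrated as follows. This is a hypothetical sketch: it reuses `aux_regularizer` from the block above and assumes `w_opt` is built over the model's weight tensors only and `aux_opt` over its auxiliary tensors only.

```python
# Hypothetical illustration of the 1:1 data split and the alternating updates
# described in the paper; dataset, model, and optimizer objects are placeholders.
import torch
from torch.utils.data import random_split, DataLoader

def make_split_loaders(train_set, batch_size=128):
    # Split the training data in half: one half for weights, one for auxiliaries.
    half = len(train_set) // 2
    x_train, x_val = random_split(train_set, [half, len(train_set) - half])
    return (DataLoader(x_train, batch_size=batch_size, shuffle=True),
            DataLoader(x_val, batch_size=batch_size, shuffle=True))

def alternating_step(model, loss_fn, w_opt, aux_opt, batch_train, batch_val, mu=5e-2):
    # 1) Update weights on the Xtrain half with the task loss.
    xb, yb = batch_train
    w_opt.zero_grad()
    loss_fn(model(xb), yb).backward()
    w_opt.step()
    # 2) Update auxiliary parameters on the Xval half, adding the sparsity penalty.
    xv, yv = batch_val
    aux_opt.zero_grad()
    loss = loss_fn(model(xv), yv)
    loss = loss + sum(aux_regularizer(m, mu) for m in model.modules()
                      if hasattr(m, "aux"))
    loss.backward()
    aux_opt.step()
```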
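
For convenience, the hyperparameters quoted in the Experiment Setup row are collected into one hypothetical config. The key names are this report's, not the paper's, and the mapping of the 5e-5 / 1e-6 coefficients to the L2-/L1-norm terms simply follows the quoted sentence.

```python
# Hypothetical config collecting the hyperparameters quoted above.
HPARAMS = {
    "aux_learning_rate": 1.5e-2,   # learning rate for the auxiliary parameters
    "mu": 5e-2,                    # coefficient of the auxiliary regularizer
    "reg_l2": 5e-5,                # L2-norm coefficient quoted for the L1 loss term
    "reg_l1": 1e-6,                # L1-norm coefficient quoted for the L1 loss term
    "lr_resnet50": (1e-5, 1e-6),   # learning rate schedule from 1e-5 down to 1e-6
    "lr_alexnet": 1e-3,
    "lr_mobilenet_v2": 1e-5,
}
```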