AutoPrune: Automatic Network Pruning by Regularizing Auxiliary Parameters
Authors: Xia Xiao, Zigeng Wang, Sanguthevar Rajasekaran
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method with LeNet and VGG-like on MNIST and CIFAR-10 datasets, and with AlexNet, ResNet and MobileNet on ImageNet to establish the scalability of our work. Results show that our model achieves state-of-the-art sparsity, e.g. 7%, 23% FLOPs and 310x, 75x compression ratio for LeNet5 and VGG-like structure without accuracy drop, and 200M and 100M FLOPs for MobileNetV2 with accuracy 73.32% and 66.83% respectively. |
| Researcher Affiliation | Academia | Xia Xiao, Zigeng Wang, Sanguthevar Rajasekaran Department of Computer Science and Engineering University of Connecticut Storrs, CT, USA, 06269 {xia.xiao, zigeng.wang, sanguthevar.rajasekaran}@uconn.edu |
| Pseudocode | Yes | Algorithm 1 AutoPrune |
| Open Source Code | No | The paper does not provide concrete access to source code through a specific repository link, explicit code release statement, or code in supplementary materials. |
| Open Datasets | Yes | We evaluate our method with LeNet and VGG-like on MNIST and CIFAR-10 datasets, and with AlexNet, ResNet and MobileNet on ImageNet to establish the scalability of our work. |
| Dataset Splits | Yes | The training set will be split into Xtrain and Xval... We split the training data into 1:1 for weight update and auxiliary parameter update respectively. |
| Hardware Specification | Yes | Our models are implemented by Tensorflow and run on Ubuntu Linux 16.04 with 32G memory and a single NVIDIA Titan Xp GPU. |
| Software Dependencies | No | The paper mentions "Tensorflow" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | To show the insensitivity of the introduced hyperparameter, we set the learning rate of auxiliary parameters to 1.5e-2 and µ to 5e-2 for all test cases. In this structure, we use L2-norm and L1-norm for L1 with hyperparameters 5e-5 and 1e-6, respectively. ResNet-50 is trained with a learning rate schedule from 1e-5 to 1e-6. The learning rate for AlexNet is 1e-3 and for MobileNetV2 is 1e-5. |
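
The Dataset Splits row quotes a 1:1 split of the training data, with one half used for weight updates and the other for auxiliary parameter updates. Below is a minimal sketch of such a split, assuming a simple random shuffle; the function and variable names are illustrative and do not come from the paper.

```python
import numpy as np

def split_train_for_autoprune(X, y, seed=0):
    """Split training data 1:1 into a weight-update half and an
    auxiliary-parameter-update half, mirroring the split quoted in the
    Dataset Splits row. The shuffle and all names here are assumptions."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    weight_idx, aux_idx = idx[:half], idx[half:]
    return (X[weight_idx], y[weight_idx]), (X[aux_idx], y[aux_idx])

# Example with MNIST-sized dummy arrays
X = np.zeros((60000, 28, 28), dtype=np.float32)
y = np.zeros(60000, dtype=np.int64)
(weight_X, weight_y), (aux_X, aux_y) = split_train_for_autoprune(X, y)
```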
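
For quick reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration. The dictionary and its key names below are hypothetical; only the numeric values are taken from the quoted text.

```python
# Hypothetical consolidation of the hyperparameters quoted above.
AUTOPRUNE_HPARAMS = {
    "aux_learning_rate": 1.5e-2,           # learning rate of auxiliary parameters (all test cases)
    "mu": 5e-2,                            # µ, shared across all test cases
    "l2_coeff": 5e-5,                      # L2-norm regularization coefficient
    "l1_coeff": 1e-6,                      # L1-norm regularization coefficient
    "resnet50_lr_schedule": (1e-5, 1e-6),  # ResNet-50: learning rate schedule from 1e-5 to 1e-6
    "alexnet_lr": 1e-3,                    # AlexNet learning rate
    "mobilenetv2_lr": 1e-5,                # MobileNetV2 learning rate
}
```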