Enabling Retrain-free Deep Neural Network Pruning Using Surrogate Lagrangian Relaxation
Authors: Deniz Gurevin, Mikhail Bragin, Caiwen Ding, Shanglin Zhou, Lynn Pepin, Bingbing Li, Fei Miao
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed method on image classification tasks using CIFAR-10 and ImageNet, as well as object detection tasks using COCO 2014 and Ultra-Fast-Lane-Detection using the TuSimple lane detection dataset. Experimental results demonstrate that our SLR-based weight-pruning optimization approach achieves a higher compression rate than state-of-the-art methods under the same accuracy requirement. |
| Researcher Affiliation | Academia | ¹Department of Electrical and Computer Engineering, University of Connecticut, USA; ²Department of Computer Science and Engineering, University of Connecticut, USA. {deniz.gurevin, mikhail.bragin, caiwen.ding, shanglin.zhou, lynn.pepin, bingbing.li, fei.miao}@uconn.edu |
| Pseudocode | Yes | Algorithm 1 Surrogate Lagrangian Relaxation (see the step-size sketch after this table) |
| Open Source Code | No | The paper states 'All of the baseline models we use and our code in image classification tasks are implemented with PyTorch 1.6.0 and Python 3.6.' and mentions using publicly available repositories for YOLOv3 and the TuSimple benchmark. However, it does not explicitly state that *their* code (for the SLR method) is open-source or provide a link to it. |
| Open Datasets | Yes | We evaluate the proposed method on image classification tasks using CIFAR-10 and ImageNet, as well as object detection tasks using COCO 2014 and Ultra-Fast-Lane-Detection using the TuSimple lane detection dataset. |
| Dataset Splits | No | The paper discusses 'training' and 'testing' but does not explicitly provide details about a 'validation' dataset split or its use in the experimental setup. |
| Hardware Specification | Yes | We conducted our experiments on Ubuntu 18.04 and using Nvidia Quadro RTX 6000 GPU with 24 GB GPU memory. We used 4 GPU nodes to train our models on the Image Net dataset. |
| Software Dependencies | Yes | All of the baseline models we use and our code in image classification tasks are implemented with PyTorch 1.6.0 and Python 3.6. For our experiments on the COCO 2014 dataset, we used Torch v1.6.0 and pycocotools v2.0 packages. For our experiments on the TuSimple lane detection benchmark dataset, we used Python 3.7 with Torch v1.6.0 and the SpConv v1.2 package. |
| Experiment Setup | Yes | Training Settings. In all experiments we used ρ = 0.1. In CIFAR-10 experiments, we used a learning rate of 0.01, batch size of 128 and the ADAM optimizer during training. On ImageNet, we used a learning rate of 10^-4, batch size of 256 and the SGD optimizer. For a fair comparison of the SLR and ADMM methods, we used the same number of training epochs and sparsity configuration for both methods in the experiments. Here, SLR parameters are set as M = 300, r = 0.1 and s_0 = 10^-2. (See the configuration sketch below.) |
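
The paper presents its method only as pseudocode (Algorithm 1). As a reading aid, here is a minimal Python sketch of the two ingredients that pseudocode relies on: a magnitude-based Euclidean projection onto the sparsity constraint set, and the SLR step-size rule of Bragin et al. (2015) that the paper adopts. The function names and the tensor-level projection granularity are our assumptions; only M = 300, r = 0.1, and s_0 = 10^-2 come from the paper's reported settings.

```python
import torch

def project_to_sparsity(w: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Euclidean projection onto the sparsity constraint set:
    keep the largest-magnitude entries and zero out the rest."""
    k = max(1, int(keep_ratio * w.numel()))
    threshold = w.abs().flatten().topk(k).values.min()
    return torch.where(w.abs() >= threshold, w, torch.zeros_like(w))

def slr_step_size(k: int, s_prev: float, g_prev_norm: float, g_norm: float,
                  M: int = 300, r: float = 0.1) -> float:
    """SLR step-size update (Bragin et al., 2015):
        p_k = 1 - 1/k^r,  alpha_k = 1 - 1/(M * k^p_k),
        s_k = alpha_k * s_{k-1} * ||g_{k-1}|| / ||g_k||,
    where g_k is the surrogate subgradient (the constraint violation W - Z)."""
    p = 1.0 - 1.0 / (k ** r)
    alpha = 1.0 - 1.0 / (M * k ** p)
    return alpha * s_prev * g_prev_norm / max(g_norm, 1e-12)
```

In the ADMM-style splitting the paper builds on, each outer iteration k trains the weights W on the augmented Lagrangian (penalty ρ = 0.1), sets the auxiliary variables Z via the projection above, and updates the multipliers by λ ← λ + s_k (W − Z), with s_k produced by the rule above starting from s_0 = 10^-2.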
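The quoted training settings also map directly onto optimizer configurations. The following is a hypothetical reconstruction under the reported PyTorch 1.6.0 / Python 3.6 environment: the model is a placeholder and the variable names are ours; only the hyperparameter values are taken from the paper.

```python
import torch.nn as nn
import torch.optim as optim

# Placeholder network; the paper prunes image classifiers on CIFAR-10/ImageNet,
# YOLOv3 on COCO 2014, and Ultra-Fast-Lane-Detection on TuSimple.
model = nn.Linear(128, 10)

rho = 0.1                  # augmented-Lagrangian penalty, used in all experiments
M, r, s0 = 300, 0.1, 1e-2  # SLR step-size parameters reported in the paper

# CIFAR-10: ADAM optimizer, learning rate 0.01, batch size 128.
cifar10_optimizer = optim.Adam(model.parameters(), lr=0.01)
cifar10_batch_size = 128

# ImageNet: SGD optimizer, learning rate 10^-4, batch size 256.
imagenet_optimizer = optim.SGD(model.parameters(), lr=1e-4)
imagenet_batch_size = 256
```

Since the paper compares SLR against ADMM with the same number of training epochs and the same sparsity configuration, any reproduction script should hold those fixed across both methods.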