CHIP: CHannel Independence-based Pruning for Compact Neural Networks

Authors: Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Aliari Zonouz, Bo Yuan

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our evaluation results for different models on various datasets show the superior performance of our approach. Notably, on CIFAR-10 dataset our solution can bring 0.90% and 0.94% accuracy increase over baseline ResNet-56 and ResNet-110 models, respectively, and meanwhile the model size and FLOPs are reduced by 42.8% and 47.4% (for ResNet-56) and 48.3% and 52.1% (for ResNet-110), respectively. On ImageNet dataset, our approach can achieve 40.8% and 44.8% storage and computation reductions, respectively, with 0.15% accuracy increase over the baseline ResNet-50 model.
Researcher Affiliation | Academia | Yang Sui, Miao Yin, Yi Xie, Huy Phan, Saman Zonouz, Bo Yuan; Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ 08854, USA; {yang.sui, miao.yin, yi.xie, huy.phan, saman.zonouz}@rutgers.edu, bo.yuan@soe.rutgers.edu
Pseudocode | Yes | Algorithm 1: CHannel Independence-based Pruning (CHIP) procedure for the l-th layer. (A hedged Python sketch of this per-layer procedure is given after the table.)
  Input: pre-trained weight tensor W^l; N sets of feature maps A^l = {A^l_1, A^l_2, ..., A^l_{c_l}} ∈ R^{c_l × h × w} from N input samples; the desired number of filters to be preserved, κ_l.
  Output: pruned weight tensor W^l_prune.
  1: for each input sample do
  2:   Flatten feature maps: A^l := reshape(A^l, [c_l, h·w]);
  3:   for i = 1 to c_l do
  4:     CI calculation: calculate CI(A^l_i) via Equation 3;
  5:   end for
  6: end for
  7: Averaging: average CI(A^l_i) over all N input samples;
  8: Sorting: sort {CI(A^l_i)}_{i=1}^{c_l} in ascending order;
  9: Pruning: prune the c_l - κ_l filters in W^l corresponding to the c_l - κ_l smallest CI(A^l_i);
  10: Fine-tuning: obtain the final W^l_prune by fine-tuning W^l with the pruned filter channels removed.
Open Source Code | Yes | The code is available at https://github.com/Eclipsess/CHIP_NeurIPS2021.
Open Datasets | Yes | To be specific, we conduct experiments for three CNN models (ResNet-56, ResNet-110 and VGG-16) on CIFAR-10 dataset [24]. Also, we further evaluate our approach and compare its performance with other state-of-the-art pruning methods for ResNet-50 model on large-scale ImageNet dataset [5]. (A hedged data-loading sketch is given after the table.)
Dataset Splits | No | The paper mentions fine-tuning on the CIFAR-10 and ImageNet datasets with specific batch sizes and epoch counts, but it does not explicitly state the train/validation/test splits (e.g., percentages or sample counts) used, nor does it confirm that the standard validation splits were followed.
Hardware Specification | Yes | We conduct our empirical evaluations on Nvidia Tesla V100 GPUs with PyTorch 1.7 framework.
Software Dependencies | Yes | We conduct our empirical evaluations on Nvidia Tesla V100 GPUs with PyTorch 1.7 framework.
Experiment Setup | Yes | To be specific, we perform the fine-tuning for 300 epochs on CIFAR-10 datasets with the batch size, momentum, weight decay and initial learning rate as 128, 0.9, 0.05 and 0.01, respectively. On the ImageNet dataset, fine-tuning is performed for 180 epochs with the batch size, momentum, weight decay and initial learning rate as 256, 0.99, 0.0001 and 0.1, respectively. (A hedged optimizer-setup sketch is given after the table.)
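
The Algorithm 1 row above describes a per-layer score, average, sort, and prune loop. The sketch below is a minimal Python/PyTorch rendering of that loop, not the authors' implementation: Equation 3 is not reproduced in this report, so channel_independence() uses a nuclear-norm-drop heuristic purely as an assumed stand-in, and the final pruning and fine-tuning of W^l (steps 9-10) are left out.

```python
import torch

def channel_independence(A_flat: torch.Tensor) -> torch.Tensor:
    """Score each channel of one sample's flattened feature maps A_flat [c_l, h*w].

    Placeholder for Equation 3 (not reproduced here): the score is taken as the
    drop in nuclear norm when a channel's row is masked out. This is an assumption,
    not necessarily the paper's exact metric.
    """
    full_norm = torch.linalg.matrix_norm(A_flat, ord="nuc")
    scores = torch.empty(A_flat.shape[0])
    for i in range(A_flat.shape[0]):
        masked = A_flat.clone()
        masked[i].zero_()                      # remove channel i's contribution
        scores[i] = full_norm - torch.linalg.matrix_norm(masked, ord="nuc")
    return scores

def chip_keep_indices(feature_maps, kappa_l: int) -> torch.Tensor:
    """feature_maps: iterable of [c_l, h, w] tensors, one per input sample.

    Returns indices of the kappa_l filters to keep (largest average CI).
    """
    ci_sum, n = None, 0
    for A in feature_maps:
        A_flat = A.reshape(A.shape[0], -1)     # step 2: flatten to [c_l, h*w]
        ci = channel_independence(A_flat)      # steps 3-5: per-channel CI
        ci_sum = ci if ci_sum is None else ci_sum + ci
        n += 1
    ci_avg = ci_sum / n                        # step 7: average over N samples
    order = torch.argsort(ci_avg)              # step 8: ascending CI
    return order[-kappa_l:]                    # keep the kappa_l largest-CI filters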
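
The open-datasets row names CIFAR-10 and ImageNet but does not prescribe a data pipeline. The following is an assumed torchvision loading sketch for CIFAR-10; the augmentation and normalization constants are common defaults rather than values taken from the paper, and only the batch size of 128 comes from the experiment-setup row.

```python
import torch
import torchvision
import torchvision.transforms as T

# Assumed CIFAR-10 pipeline; the paper excerpt only names the dataset.
transform = T.Compose([
    T.RandomCrop(32, padding=4),             # common augmentation (assumption)
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465),    # widely used CIFAR-10 statistics
                (0.2470, 0.2435, 0.2616)),
])

train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,
                                           shuffle=True, num_workers=4)
```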
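
The experiment-setup row fixes the CIFAR-10 fine-tuning hyperparameters (300 epochs, batch size 128, momentum 0.9, weight decay 0.05, initial learning rate 0.01) but not the optimizer class or learning-rate schedule. The sketch below wires those values into a PyTorch SGD optimizer; SGD and the cosine schedule are assumptions rather than details stated in the excerpt.

```python
import torch

def make_cifar10_finetune_optimizer(model: torch.nn.Module, epochs: int = 300):
    # Values below are the CIFAR-10 settings quoted in the experiment-setup row.
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=0.01,             # initial learning rate
        momentum=0.9,        # momentum
        weight_decay=0.05,   # weight decay
    )
    # The excerpt does not state a schedule; cosine annealing is an assumed default.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler
```

For the ImageNet settings, the same wiring would use 180 epochs, batch size 256, momentum 0.99, weight decay 0.0001, and initial learning rate 0.1, per the same row.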