Rethinking the Pruning Criteria for Convolutional Neural Network

Authors: Zhongzhan Huang, Wenqi Shao, Xinjiang Wang, Liang Lin, Ping Luo

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | From our comprehensive experiments, we found two blind spots of pruning criteria: (1) Similarity: there are strong similarities among several primary pruning criteria that are widely cited and compared. According to these criteria, the ranks of the filters' Importance Scores are almost identical, resulting in similar pruned structures. (2) Applicability: the filters' Importance Scores measured by some pruning criteria are too close together to distinguish network redundancy well. In this paper, we analyze the above blind spots on different types of pruning criteria with layer-wise pruning or global pruning. We also break some stereotypes, e.g., the results of ℓ1 and ℓ2 pruning are not always similar. These analyses are based on empirical experiments and our assumption (the Convolutional Weight Distribution Assumption) that the well-trained convolutional filters in each layer approximately follow a Gaussian-like distribution. This assumption has been verified through systematic and extensive statistical tests.
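As an illustration of the similarity finding above, the following is a minimal sketch (not the authors' released code) that computes ℓ1- and ℓ2-norm Importance Scores for the filters of one convolutional layer and compares the ranks they induce; the choice of VGG16 and of the 3rd convolutional layer is an assumption taken from the examples named in the paper.

    import torch
    import torchvision

    # Pretrained VGG16 as an example network (illustrative choice).
    model = torchvision.models.vgg16(pretrained=True)
    convs = [m for m in model.modules() if isinstance(m, torch.nn.Conv2d)]
    w = convs[2].weight.detach()  # 3rd conv layer, shape (out_channels, in_channels, k, k)

    # Norm-based Importance Score per filter.
    l1_score = w.abs().sum(dim=(1, 2, 3))
    l2_score = w.pow(2).sum(dim=(1, 2, 3)).sqrt()

    # Fraction of filters given the same rank by both criteria; a value close
    # to 1 corresponds to the "almost identical ranks" observation.
    same_rank = (l1_score.argsort() == l2_score.argsort()).float().mean()
    print(same_rank.item())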
Researcher Affiliation | Collaboration | Zhongzhan Huang (1), Wenqi Shao (2,3), Xinjiang Wang (3), Liang Lin (1), Ping Luo (4). 1: Sun Yat-Sen University, 2: The Chinese University of Hong Kong, 3: SenseTime Research, 4: The University of Hong Kong
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code for the described methodology, nor does it include a link to a code repository.
Open Datasets | Yes | Taking VGG16 (3rd Conv) and ResNet18 (12th Conv) under norm-based criteria as examples, the pruned filter indices (the ranks of the filters' Importance Scores) are almost the same, which leads to similar pruned structures. ... ResNet56 on CIFAR10/100 ... ResNet-18 trained on the ImageNet dataset. ... trained on CIFAR100. ... VGG16 (CIFAR10).
Dataset Splits | No | The paper mentions training on datasets like CIFAR10/100 and ImageNet, which have standard splits, and discusses 'test accuracy', but it does not explicitly state the train/validation/test split percentages or sample counts used for reproduction.
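For reference, a minimal sketch of the standard torchvision CIFAR10 split (an assumption about what the authors used, since the paper does not state its splits):

    import torchvision

    # Standard CIFAR10 split shipped with torchvision: 50,000 train / 10,000 test.
    train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True)
    test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True)
    print(len(train_set), len(test_set))  # 50000 10000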
Hardware Specification | No | The acknowledgments section mentions 'Mingfu Liang for offering his self-purchasing GPUs' but does not specify the model, quantity, or other details of the GPUs or any other hardware used for the experiments.
Software Dependencies | No | The paper mentions the 'torchvision model zoo [20]' but does not specify version numbers for PyTorch, torchvision, or any other software dependencies needed to replicate the experiments.
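A minimal sketch of pulling a pretrained model from the torchvision model zoo while logging the library versions explicitly, since the paper does not pin them; the ResNet-18 choice is an assumption based on the models named in the paper.

    import torch
    import torchvision

    # Record versions up front, since the paper does not specify them.
    print("torch", torch.__version__, "torchvision", torchvision.__version__)

    # Pretrained ResNet-18 from the torchvision model zoo.
    model = torchvision.models.resnet18(pretrained=True)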
Experiment Setup | No | The paper discusses pruning ratios (e.g., 'prune 50% filters in all layers') and mentions a 'one-shot method' and 'fine-tuning', but it does not provide specific details on hyper-parameters such as learning rates, optimizers, batch sizes, or the number of epochs used for training or fine-tuning, which are crucial for reproducing the experimental setup.
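A minimal sketch of one-shot, layer-wise pruning of 50% of the filters in every convolutional layer using torch.nn.utils.prune; the ℓ1 criterion, the base model, and all fine-tuning hyper-parameters are assumptions, since the paper does not specify them.

    import torch
    import torch.nn.utils.prune as prune
    import torchvision

    model = torchvision.models.resnet18(pretrained=True)
    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            # Zero out the 50% of filters (dim=0) with the smallest l1 norm.
            prune.ln_structured(module, name="weight", amount=0.5, n=1, dim=0)
    # Fine-tuning would follow here, with assumed optimizer, learning rate, and epochs.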