A Gradient Flow Framework For Analyzing Network Pruning

Authors: Ekdeep Singh Lubana, Robert P. Dick

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our claims on several VGG-13, MobileNet-V1, and ResNet-56 models trained on CIFAR-10/CIFAR-100." and "The results are shown in Table 1. Magnitude-based pruning consistently performs better than loss-preservation based pruning." (A hedged sketch of these two pruning criteria appears after this table.)
Researcher Affiliation | Academia | Ekdeep Singh Lubana & Robert P. Dick, EECS Department, University of Michigan, Ann Arbor, MI 48105, USA, {eslubana, dickrp}@umich.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It provides mathematical equations (Equations 1-10, 12-20, and 22-24), but these are not pseudocode.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it link to a code repository for the described methodology.
Open Datasets | Yes | "We validate our claims on several VGG-13, MobileNet-V1, and ResNet-56 models trained on CIFAR-10/CIFAR-100." and "We train VGG-13, MobileNet-V1, and ResNet-56 models on CIFAR-100." Appendix I adds: "To further verify our claims, we repeat some of our experiments on Tiny-ImageNet and confirm that our claims indeed generalize to the same."
Dataset Splits | No | "Train/test accuracy curves for pruned ResNet-56 models on CIFAR-10 (left) and CIFAR-100 (right) over 25 rounds." and "Furthermore, train/test convergence for magnitude-based pruned models is faster than that for loss-preservation based pruned models, as shown in Figure 1." The paper refers to training and test sets but does not explicitly specify a validation split or its details.
Hardware Specification | No | The paper does not specify the hardware used to run the experiments. It only mentions 'intelligent edge systems' as a general application context for DNNs, not as an experimental platform.
Software Dependencies | No | The paper mentions 'SGD (stochastic gradient descent) or its variants' and concepts such as 'softmax for classification', but it does not provide version numbers for any software dependencies (programming languages, libraries, or frameworks).
Experiment Setup | Yes | "Optimizer: SGD, Momentum: 0.9, Weight decay: 0.0001, Learning rate schedule: (0.1, 0.01, 0.001), Number of epochs for each learning rate: (80, 40, 40), Batch Size: 128." and "Number of epochs for each learning rate: (80 number of pruning rounds, 40, 40)." The paper also specifies pruning ratios for different models and datasets. (A minimal training-loop sketch of these hyperparameters follows the table.)
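
The hyperparameters quoted in the Experiment Setup row map onto a standard SGD training loop with a step learning-rate schedule. The sketch below is a minimal reconstruction in PyTorch, assuming CIFAR-100, a plain ToTensor transform, and torchvision's generic VGG-13 head as a placeholder model; the paper releases no code, so this is an illustration of the reported settings, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

# Assumed data pipeline: the paper does not detail augmentations here, so only ToTensor is used.
train_set = torchvision.datasets.CIFAR100(root="./data", train=True,
                                          download=True, transform=T.ToTensor())
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

# Placeholder network: torchvision's ImageNet-style VGG-13, not the CIFAR variant used in the paper.
model = torchvision.models.vgg13(num_classes=100)

# Reported settings: SGD, momentum 0.9, weight decay 1e-4, batch size 128.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
# Learning rate 0.1 -> 0.01 -> 0.001, held for 80/40/40 epochs respectively.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[80, 120], gamma=0.1)
criterion = nn.CrossEntropyLoss()

for epoch in range(160):  # 80 + 40 + 40 epochs in total
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```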
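
The Research Type row contrasts magnitude-based pruning with loss-preservation based pruning. For reference, the snippet below computes the per-parameter saliency scores commonly associated with those two criteria, |theta| and |theta * dL/dtheta|; the function name and the choice to score every parameter tensor are illustrative assumptions, not code from the paper.

```python
import torch

def pruning_scores(model, loss):
    """Per-parameter saliency under the two criteria compared above.

    magnitude[name]         = |theta|              (magnitude-based pruning)
    loss_preservation[name] = |theta * dL/dtheta|  (loss-preservation based pruning)
    """
    loss.backward()  # populate .grad for every parameter
    magnitude, loss_preservation = {}, {}
    for name, param in model.named_parameters():
        magnitude[name] = param.detach().abs()
        loss_preservation[name] = (param.detach() * param.grad).abs()
    return magnitude, loss_preservation
```

Parameters with the smallest scores would then be removed, up to the pruning ratios referenced in the Experiment Setup row.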