Dynamic Network Pruning with Interpretable Layerwise Channel Selection
Authors: Yulong Wang, Xiaolu Zhang, Xiaolin Hu, Bo Zhang, Hang Su
AAAI 2020, pp. 6299-6306
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our dynamic network achieves higher prediction accuracy under the similar computing budgets on CIFAR10 and ImageNet datasets compared to traditional static pruning methods and other dynamic pruning approaches. |
| Researcher Affiliation | Collaboration | 1Tsinghua University, 2Ant Financial |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement or link regarding the public availability of its source code. |
| Open Datasets | Yes | We conduct extensive experiments on CIFAR10, CIFAR100, SVHN and ImageNet datasets. |
| Dataset Splits | No | Only 10% of randomly chosen test samples are used for training the binary classifiers and rest for evaluation. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU models, CPU types, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions optimizers (Adam, SGD) and libraries (UMAP) but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | For CIFAR10 models, we train the full models for 160 epochs using a batch-size of 128 with SGD optimizer. The initial learning rate 0.1 is divided at 50% and 75% of the total number of training epochs. We use a momentum of 0.9 with weight decay of 10^-4. For the CIFAR10 experiments, we choose m = 5 actions for each decision unit... The target sparsity ratio r is 0.1 for VGG16-BN and 0.4 for ResNet56. Balance factor γ = 1.0, the learning rate is 0.01, training batch size is 128, and the total training epoch is 100. |
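
To make the quoted CIFAR10 full-model training schedule concrete, here is a minimal PyTorch-style sketch of that configuration. It is not the authors' code: the milestone epochs (80 and 120, i.e. 50% and 75% of 160), the step factor of 0.1, and the `train` helper are assumptions; only the values quoted in the Experiment Setup row are taken from the paper.

```python
# Hedged sketch of the reported CIFAR10 full-model training setup.
# Assumptions (not stated verbatim in the excerpt): the learning rate is
# divided by 10 at each milestone; model and data loading are supplied by
# the caller.
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import MultiStepLR

EPOCHS = 160          # "train the full models for 160 epochs"
BATCH_SIZE = 128      # "batch-size of 128"
BASE_LR = 0.1         # "initial learning rate 0.1"
MILESTONES = [int(EPOCHS * 0.5), int(EPOCHS * 0.75)]  # 50% and 75% of epochs


def make_optimizer_and_scheduler(model: nn.Module):
    optimizer = optim.SGD(
        model.parameters(),
        lr=BASE_LR,
        momentum=0.9,       # "momentum of 0.9"
        weight_decay=1e-4,  # "weight decay of 10^-4"
    )
    # Step factor 0.1 is an assumption; the excerpt only says the rate is "divided".
    scheduler = MultiStepLR(optimizer, milestones=MILESTONES, gamma=0.1)
    return optimizer, scheduler


def train(model: nn.Module, train_loader, device: str = "cuda"):
    """Standard cross-entropy training loop matching the quoted schedule."""
    criterion = nn.CrossEntropyLoss()
    optimizer, scheduler = make_optimizer_and_scheduler(model)
    model.to(device).train()
    for epoch in range(EPOCHS):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()
```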
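
The second half of the quoted setup describes the dynamic-pruning stage. The sketch below only collects those quoted values into a configuration object; the class and field names are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class DynamicPruningConfig:
    """Quoted CIFAR10 dynamic-pruning hyperparameters (field names are hypothetical)."""
    num_actions: int = 5           # m = 5 actions per decision unit
    target_sparsity: float = 0.1   # r = 0.1 for VGG16-BN (0.4 for ResNet56)
    balance_factor: float = 1.0    # gamma = 1.0
    learning_rate: float = 0.01
    batch_size: int = 128
    epochs: int = 100


# Example: the ResNet56 setting quoted above differs only in the target sparsity.
resnet56_cfg = DynamicPruningConfig(target_sparsity=0.4)
```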