Network Pruning That Matters: A Case Study on Retraining Variants

Authors: Duong Hoang Le, Binh-Son Hua

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | In this work, we conduct extensive experiments to verify and analyze the uncanny effectiveness of learning rate rewinding.
Researcher Affiliation | Collaboration | Duong H. Le (VinAI Research, Vietnam); Binh-Son Hua (VinAI Research and VinUniversity, Vietnam)
Pseudocode | No | The paper describes various retraining techniques and pruning algorithms but does not include any formal pseudocode or algorithm blocks.
Open Source Code | No | To facilitate reproducibility, we would release our implementation upon publication.
Open Datasets | Yes | For CIFAR-10 and CIFAR-100, we run each experiment three times and report mean ± std. For ImageNet, we run each experiment once.
Dataset Splits | Yes | To make a fair comparison between fine-tuning and no fine-tuning, we randomly split the conventional training set of CIFAR-10/CIFAR-100 (including 50000 images) into train (90% of the total) and val (the remaining 10%) sets, and then report the result of the best-validation models on the standard test set (including 5000 images).
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU model, CPU type) used to run its experiments.
Software Dependencies | No | The paper mentions software such as PyTorch and refers to implementations from other works, but it does not specify version numbers for these components.
Experiment Setup | Yes | Table 6: Training configuration for unpruned models. To train on CIFAR-10, we use Nesterov SGD with β = 0.9, batch size 64, weight decay 0.0001 for 160 epochs. To train on ImageNet, we use Nesterov SGD with β = 0.9, batch size 32, weight decay 0.0001 for 90 epochs.
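
The 90/10 split described in the Dataset Splits row is straightforward to reconstruct. Below is a minimal PyTorch sketch of such a split, not the authors' code (which had not been released at the time of review); the data root, transform, and seed are illustrative assumptions.

```python
# Minimal sketch of the 90%/10% train/val split of the CIFAR-10 training
# set described in the paper. The data root, transform, and seed are
# illustrative assumptions, not the authors' (unreleased) implementation.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())

n_total = len(full_train)      # 50,000 images in the conventional train set
n_train = int(0.9 * n_total)   # 45,000 images for training
n_val = n_total - n_train      # 5,000 images held out for validation

train_set, val_set = random_split(
    full_train, [n_train, n_val],
    generator=torch.Generator().manual_seed(0))  # assumed seed

# The best-validation model is then evaluated on the standard test set.
test_set = datasets.CIFAR10(root="./data", train=False, download=True,
                            transform=transforms.ToTensor())
```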
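
The Experiment Setup row likewise maps directly onto PyTorch's SGD optimizer. The sketch below wires up the quoted CIFAR-10 configuration; the architecture and the initial learning rate are placeholder assumptions, since Table 6's learning-rate schedule is not part of the quoted excerpt.

```python
# Sketch of the Table 6 CIFAR-10 configuration quoted above: Nesterov SGD
# with momentum 0.9, weight decay 0.0001, batch size 64, 160 epochs.
# The architecture and initial learning rate are placeholder assumptions.
import torch
import torchvision

model = torchvision.models.resnet18(num_classes=10)  # placeholder model

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,             # assumed initial LR; not given in the excerpt
    momentum=0.9,       # Nesterov SGD with beta = 0.9
    weight_decay=1e-4,  # weight decay 0.0001
    nesterov=True,
)

EPOCHS = 160     # CIFAR-10: 160 epochs (ImageNet: 90 epochs)
BATCH_SIZE = 64  # CIFAR-10: batch size 64 (ImageNet: 32)
```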