Revisit Kernel Pruning with Lottery Regulated Grouped Convolutions

Authors: Shaochen Zhong, Guanqun Zhang, Ningjia Huang, Shuai Xu

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments also demonstrate our method often outperforms comparable SOTA methods with lesser data augmentation needed, smaller finetuning budget required, and sometimes even much simpler procedure executed (e.g., one-shot v. iterative)." (Section 4, Experiments and Results)
Researcher Affiliation | Academia | Shaochen (Henry) Zhong, Ningjia Huang, and Shuai Xu (Department of Computer and Data Sciences, Case Western Reserve University; {sxz517, nxh239, sxx214}@case.edu); Guanqun Zhang (Center for Combinatorics, Nankai University; zhanggq1994@mail.nankai.edu.cn)
Pseudocode | Yes | Appendix A.2.1, "Greedy Grouped Kernel Pruning Procedure": Algorithm 1, "Generate Cℓ in grouped kernel pruning strategies" (an illustrative sketch appears after the table).
Open Source Code | Yes | "Please refer to our GitHub repository for code. As we advocate our proposed framework is able to shine a new light on kernel pruning under the context of densely structured pruning, we have prepared a GitHub repository with checkpoints placed on every stage of our method."
Open Datasets | Yes | "For datasets, we choose CIFAR-10 (Krizhevsky, 2009), Tiny-ImageNet (Wu et al., 2017), and ImageNet (ILSVRC-12) (Deng et al., 2009)."
Dataset Splits | No | The paper mentions training and testing on datasets such as CIFAR-10 and ImageNet, but does not explicitly state the training/validation/test splits (as percentages or counts) or refer to a specific predefined validation split.
Hardware Specification | Yes | "The following experiments are conducted on a 2.00GHz 4-core Intel Xeon CPU and Tesla V100."
Software Dependencies | No | The paper mentions PyTorch but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | "For all experiments done on CIFAR-10 and Tiny-ImageNet, we train the baseline models for 300 epochs with the learning rate starting at 0.1 and dividing by 10 per every 100 epochs. The baseline model is trained using SGD with a weight-decay set to 5e-4, momentum set to 0.9, and a batch-size of 64. All data are augmented with random crop and random horizontal flip. For the experiments done on ImageNet, we train the ResNet-50 model for 90 epochs with the weight-decay set to 1e-4 and the learning rate dividing by 10 per every 30 epochs (while keeping all other settings the same as the CIFAR-10 and Tiny-ImageNet experiments). Our pruning settings are largely identical to our training settings except for the learning rate, which is set to 0.01 at the start." (These settings are sketched in code below.)
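
The Pseudocode row above only names Algorithm 1 without reproducing it, so the following is a rough, hypothetical sketch of what generating a grouped-kernel-pruning pattern can look like. It is not the paper's lottery-regulated procedure: the contiguous index grouping, the plain L1-norm keep criterion, and the function name `grouped_kernel_mask` are all assumptions made purely for illustration.

```python
# Hypothetical sketch only: stands in for, but does not reproduce, Algorithm 1.
import torch

def grouped_kernel_mask(weight: torch.Tensor, groups: int) -> torch.Tensor:
    """Build a {0,1} kernel mask whose surviving kernels form a grouped convolution.

    `weight` has shape (out_ch, in_ch, k, k). Output filters are split into
    `groups` contiguous blocks, and each block greedily claims the input
    channels whose kernels carry the largest total L1 norm for that block.
    """
    out_ch, in_ch = weight.shape[:2]
    assert out_ch % groups == 0 and in_ch % groups == 0
    filters_per_group = out_ch // groups
    keep_per_group = in_ch // groups
    scores = weight.abs().sum(dim=(2, 3))            # (out_ch, in_ch): L1 norm of each kernel
    mask = torch.zeros(out_ch, in_ch)
    available = torch.ones(in_ch, dtype=torch.bool)  # each input channel may serve only one group
    for g in range(groups):
        rows = slice(g * filters_per_group, (g + 1) * filters_per_group)
        block_scores = scores[rows].sum(dim=0)       # total score of each input channel for this block
        block_scores = block_scores.masked_fill(~available, float("-inf"))
        keep = torch.topk(block_scores, keep_per_group).indices
        mask[rows, keep] = 1.0
        available[keep] = False
    return mask

# Usage: prune a 3x3 convolution with 64 input/output channels into 4 groups.
w = torch.randn(64, 64, 3, 3)
m = grouped_kernel_mask(w, groups=4)
pruned = w * m[:, :, None, None]                     # zero out the discarded kernels
```

Because each block of output filters ends up reading from a disjoint subset of input channels, the surviving kernels can be executed as a standard grouped convolution, which is what makes this a densely structured form of pruning.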
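
The Experiment Setup row quotes the full hyperparameter recipe; below is a minimal PyTorch sketch of the quoted CIFAR-10 baseline settings. The backbone choice (`resnet18`), the crop padding of 4, and the dataloader worker count are assumptions not stated in the quoted text.

```python
# Minimal sketch of the quoted CIFAR-10 baseline-training recipe.
import torch
import torchvision
import torchvision.transforms as T

transform = T.Compose([
    T.RandomCrop(32, padding=4),     # padding=4 is a common CIFAR choice, not stated in the paper
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])
train_set = torchvision.datasets.CIFAR10("data/", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

model = torchvision.models.resnet18(num_classes=10)  # placeholder backbone, not the paper's exact model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=100, gamma=0.1)  # /10 every 100 epochs
criterion = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(300):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

Per the quoted text, the ImageNet/ResNet-50 runs change this to 90 epochs, weight_decay=1e-4, and a learning-rate step every 30 epochs, and the pruning stage reuses the same recipe with the initial learning rate set to 0.01.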