EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis
Authors: Chaoqi Wang, Roger Grosse, Sanja Fidler, Guodong Zhang
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically the effectiveness of the proposed method through extensive experiments. In particular, we highlight that the improvements are especially significant for more challenging datasets and networks. With negligible loss of accuracy, an iterative-pruning version gives a 10× reduction in model size and an 8× reduction in FLOPs on WideResNet32. Our code is available here. 5. Experiments: In this section, we aim to verify the effectiveness of EigenDamage in reducing the test-time resource requirements of a network without significantly sacrificing accuracy. We compare EigenDamage with other compression methods in terms of test accuracy, reduction in weights, reduction in FLOPs, and inference wall-clock time speedup. (A sketch of how such reduction metrics can be measured follows the table.) |
| Researcher Affiliation | Collaboration | 1) Department of Computer Science, University of Toronto, Toronto, Canada; 2) Vector Institute, Toronto, Canada; 3) NVIDIA. |
| Pseudocode | Yes | Algorithm 1: Pruning in the Kronecker-factored eigenbasis, i.e., EigenDamage. For simplicity, we focus on a single layer. ⊙ denotes elementwise multiplication. (A hedged single-layer sketch of this procedure follows the table.) |
| Open Source Code | Yes | Our code is available here. |
| Open Datasets | Yes | We make use of three standard benchmark datasets: CIFAR10, CIFAR100 (Krizhevsky, 2009) and Tiny-ImageNet. |
| Dataset Splits | No | The paper mentions training and testing on standard benchmark datasets (CIFAR10, CIFAR100, Tiny-ImageNet) but does not explicitly state the specific train/validation/test splits, their percentages, or sample counts, nor does it refer to a predefined standard split for all three parts. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper mentions training with SGD and various neural network architectures (VGGNet, ResNet) and methods (K-FAC, NN Slimming) but does not list specific software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | We train the networks for 150 epochs for CIFAR datasets and 300 epochs for Tiny-ImageNet with an initial learning rate of 0.1 and weight decay of 2e-4. The learning rate is decayed by a factor of 10 at 1/4 of the total number of training epochs. For the networks trained with L1 sparsity on BatchNorm, we followed the same settings as in Liu et al. (2017). After pruning, the network is finetuned for 150 epochs with an initial learning rate of 1e-3 and weight decay of 1e-4. (A hedged PyTorch sketch of this schedule follows the table.) |
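
The experiments row above compares methods by test accuracy, reduction in weights, reduction in FLOPs, and inference wall-clock speedup. The snippet below is a minimal sketch, not taken from the paper, of how such reduction ratios could be measured in PyTorch; all function names (`count_params`, `conv_flops`, `wall_clock`) are illustrative.

```python
# Minimal sketch (not from the paper) of how the reported compression
# metrics could be computed: reduction in weights, reduction in FLOPs,
# and wall-clock inference speedup. All names here are illustrative.
import time
import torch
import torch.nn as nn


def count_params(model: nn.Module) -> int:
    """Total number of trainable parameters."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def conv_flops(layer: nn.Conv2d, out_h: int, out_w: int) -> int:
    """Multiply-accumulate count for one conv layer on one input."""
    kh, kw = layer.kernel_size
    return layer.out_channels * out_h * out_w * layer.in_channels * kh * kw


@torch.no_grad()
def wall_clock(model: nn.Module, x: torch.Tensor, iters: int = 100) -> float:
    """Average forward-pass time in seconds."""
    model.eval()
    start = time.time()
    for _ in range(iters):
        model(x)
    return (time.time() - start) / iters


# The reported reductions are dense/pruned ratios, e.g. 10x in weights
# and 8x in FLOPs on WideResNet32:
# weight_reduction = count_params(dense_net) / count_params(pruned_net)
# speedup          = wall_clock(dense_net, x) / wall_clock(pruned_net, x)
```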
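The pseudocode row refers to Algorithm 1, which prunes a single layer in the Kronecker-factored eigenbasis (KFE). The sketch below illustrates the core idea under the assumption that the two Kronecker factors of the layer's Fisher, an input second-moment matrix `A` and an output-gradient second-moment matrix `S`, have already been estimated K-FAC style. The function name, thresholding rule, and returned bottleneck form are illustrative; this is not the authors' released implementation.

```python
# Rough sketch of single-layer pruning in the Kronecker-factored eigenbasis.
# Assumes Kronecker factors A (d_in x d_in) and S (d_out x d_out) are given.
import torch


def eigendamage_prune_layer(W, A, S, sparsity):
    """Prune a (d_out, d_in) weight matrix in the Kronecker-factored eigenbasis."""
    # Eigendecompose the two (symmetric PSD) Kronecker factors.
    lam_a, Q_a = torch.linalg.eigh(A)
    lam_s, Q_s = torch.linalg.eigh(S)

    # Rotate the weights into the Kronecker-factored eigenbasis (KFE).
    W_kfe = Q_s.T @ W @ Q_a

    # The Fisher is approximately diagonal in the KFE, so removing entry
    # (i, j) costs roughly lam_s[i] * lam_a[j] * W_kfe[i, j] ** 2.
    importance = torch.outer(lam_s, lam_a) * W_kfe ** 2

    # Zero out the least important fraction of entries (elementwise mask, ⊙).
    k = max(1, int(importance.numel() * sparsity))
    threshold = importance.flatten().kthvalue(k).values
    mask = (importance > threshold).to(W_kfe.dtype)

    # The paper keeps the layer in a factored bottleneck form (rotation,
    # masked KFE weights, rotation) rather than a single dense matrix.
    return Q_s, W_kfe * mask, Q_a
```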
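The experiment-setup row quotes the training and fine-tuning hyperparameters. The following is a minimal PyTorch sketch of that schedule, assuming standard SGD; the momentum value and batch size are assumptions not stated in the quote, and the single decay milestone follows the quoted schedule (1/4 of the total epochs).

```python
# Minimal sketch (assumed PyTorch setup, not the authors' released code) of
# the quoted recipe: SGD, initial lr 0.1, weight decay 2e-4, lr decayed by
# 10x at 1/4 of the epochs; fine-tuning after pruning uses lr 1e-3, wd 1e-4.
import torch


def make_optimizer(model, epochs, lr, weight_decay):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9,            # assumption, not quoted
                                weight_decay=weight_decay)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[epochs // 4], gamma=0.1)  # decay lr by 10x
    return optimizer, scheduler


# Pre-training: 150 epochs on CIFAR, 300 on Tiny-ImageNet.
# opt, sched = make_optimizer(net, epochs=150, lr=0.1, weight_decay=2e-4)
# Fine-tuning the pruned network for 150 epochs:
# opt, sched = make_optimizer(pruned_net, epochs=150, lr=1e-3, weight_decay=1e-4)
```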