A Signal Propagation Perspective for Pruning Neural Networks at Initialization
Authors: Namhoon Lee, Thalaiyasingam Ajanthan, Stephen Gould, Philip H. S. Torr
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our modifications to the existing pruning at initialization method lead to improved results on all tested network models for image classification tasks. Furthermore, we empirically study the effect of supervision for pruning and demonstrate that our signal propagation perspective, combined with unsupervised pruning, can be useful in various scenarios where pruning is applied to non-standard arbitrarily-designed architectures. |
| Researcher Affiliation | Academia | Namhoon Lee¹, Thalaiyasingam Ajanthan², Stephen Gould², Philip H. S. Torr¹; ¹University of Oxford, ²Australian National University; ¹{namhoon,phst}@robots.ox.ac.uk, ²{thalaiyasingam.ajanthan, stephen.gould}@anu.edu.au |
| Pseudocode | No | The paper contains mathematical derivations and descriptions of processes but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code can be found here: https://github.com/namhoonlee/spp-public. |
| Open Datasets | Yes | Throughout experiments, we evaluate pruning results on MNIST, CIFAR-10, and Tiny-ImageNet image classification tasks. |
| Dataset Splits | Yes | For all experiments, we use 10% of the training set for the validation set, which corresponds to 5400, 5000, 9000 images for MNIST, CIFAR-10, Tiny-ImageNet, respectively. |
| Hardware Specification | No | The paper mentions that computations can take "less than a few seconds on a modern computer" but does not provide any specific hardware details such as CPU/GPU models or memory specifications used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) needed to replicate the experiments. |
| Experiment Setup | Yes | For training of the pruned sparse networks, we use SGD with momentum and train up to 80k (for MNIST) or 100k (for CIFAR-10 and Tiny-ImageNet) iterations. The initial learning rate is set to 0.1 and is decayed by 1/10 at every 20k (MNIST) or 25k (CIFAR-10 and Tiny-ImageNet) iterations. The mini-batch size is set to 100, 128, 200 for MNIST, CIFAR-10, Tiny-ImageNet, respectively. A hedged configuration sketch based on these settings follows the table. |
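The "Dataset Splits" and "Experiment Setup" rows together specify a concrete training configuration. The sketch below collects those quoted numbers in one place. It is not the authors' released implementation (see https://github.com/namhoonlee/spp-public); the use of PyTorch and the momentum value of 0.9 are assumptions not stated in the table.

```python
# Minimal sketch of the quoted training setup, not the authors' released code.
# Assumptions beyond the table: PyTorch as the framework and momentum = 0.9.
import torch

# Per-dataset settings quoted in the "Dataset Splits" and "Experiment Setup" rows.
CONFIGS = {
    "mnist":         {"iters": 80_000,  "decay_every": 20_000, "batch": 100, "val_images": 5_400},
    "cifar10":       {"iters": 100_000, "decay_every": 25_000, "batch": 128, "val_images": 5_000},
    "tiny_imagenet": {"iters": 100_000, "decay_every": 25_000, "batch": 200, "val_images": 9_000},
}


def make_optimizer_and_scheduler(model: torch.nn.Module, dataset: str):
    """SGD with momentum, initial LR 0.1, decayed by 1/10 at fixed iteration milestones."""
    cfg = CONFIGS[dataset]
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # momentum value assumed
    milestones = list(range(cfg["decay_every"], cfg["iters"], cfg["decay_every"]))
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=milestones, gamma=0.1)
    return optimizer, scheduler, cfg


if __name__ == "__main__":
    model = torch.nn.Linear(3 * 32 * 32, 10)  # stand-in for a pruned sparse network
    optimizer, scheduler, cfg = make_optimizer_and_scheduler(model, "cifar10")
    print(f"{cfg['iters']} iterations, mini-batch size {cfg['batch']}, "
          f"{cfg['val_images']} validation images held out")
    print("LR decays by 1/10 at iterations:", sorted(scheduler.milestones))
```

With the scheduler stepped once per training iteration, the milestone list reproduces the quoted schedule: a 1/10 decay at every 20k iterations for MNIST and every 25k iterations for CIFAR-10 and Tiny-ImageNet.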