Comparing Rewinding and Fine-tuning in Neural Network Pruning
Authors: Alex Renda, Jonathan Frankle, Michael Carbin
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study neural network pruning on a variety of standard architectures for image classification and machine translation. Specifically, we consider ResNet-56 (He et al., 2016) for CIFAR-10 (Krizhevsky, 2009), ResNet-34 and ResNet-50 (He et al., 2016) for ImageNet (Russakovsky et al., 2015), and GNMT (Wu et al., 2016) for WMT16 EN-DE. |
| Researcher Affiliation | Academia | Alex Renda, MIT CSAIL (renda@csail.mit.edu); Jonathan Frankle, MIT CSAIL (jfrankle@csail.mit.edu); Michael Carbin, MIT CSAIL (mcarbin@csail.mit.edu) |
| Pseudocode | Yes | Algorithm 1: Our pruning algorithm. (Minimal sketches of this loop and of its retraining step appear after the table.) |
| Open Source Code | Yes | Our implementation and the data from the experiments in this paper are available at: https://github.com/lottery-ticket/rewinding-iclr20-public |
| Open Datasets | Yes | We study neural network pruning on a variety of standard architectures for image classification and machine translation. Specifically, we consider ResNet-56 (He et al., 2016) for CIFAR-10 (Krizhevsky, 2009), ResNet-34 and ResNet-50 (He et al., 2016) for ImageNet (Russakovsky et al., 2015), and GNMT (Wu et al., 2016) for WMT16 EN-DE. |
| Dataset Splits | Yes | For vision networks, we use 20% of the original test set, selected at random, as the validation set; the remainder of the original test set is used to report test accuracies. For WMT16 EN-DE, we use newstest2014 as the validation set (following Wu et al., 2016), and newstest2015 as the test set (following Zhu & Gupta, 2018). (A sketch of the vision split appears after the table.) |
| Hardware Specification | Yes | We gratefully acknowledge the support of Google, which provided us with access to the TPU resources necessary to conduct experiments on ImageNet and WMT through the TensorFlow Research Cloud. In particular, we express our gratitude to Zak Stone. We gratefully acknowledge the support of IBM, which provided us with access to the GPU resources necessary to conduct experiments on CIFAR-10 through the MIT-IBM Watson AI Lab. |
| Software Dependencies | No | The paper mentions optimizers such as Nesterov SGD and Adam, refers to TensorFlow via the 'TensorFlow Research Cloud', and links to GitHub repositories for some models. However, it does not specify version numbers for any software components or libraries, which reproducibility requires. |
| Experiment Setup | Yes | Table 1: Networks, datasets, and hyperparameters. We use standard implementations available online and standard hyperparameters. All accuracies are in line with baselines reported for these networks (Liu et al., 2019; He et al., 2018; Gale et al., 2019; Wu et al., 2016; Zhu & Gupta, 2018). The table details the optimizer, learning rate (with schedule), batch size, weight decay, and number of epochs for each network/dataset. |
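
For concreteness, here is a minimal NumPy sketch of the iterative magnitude-pruning loop that Algorithm 1 describes. The `retrain` argument and the no-op retrainer in the toy run are hypothetical placeholders, not the paper's implementation; in the paper, retraining means running SGD for some number of epochs using one of the three techniques sketched next.

```python
import numpy as np

def magnitude_prune(weights, mask, fraction=0.2):
    # Zero out the lowest-magnitude `fraction` of the still-unpruned
    # weights (global magnitude pruning, per round of Algorithm 1).
    alive = np.abs(weights[mask])
    threshold = np.quantile(alive, fraction)
    return mask & (np.abs(weights) > threshold)

def iterative_prune(weights, retrain, rounds=3, fraction=0.2):
    # Each round prunes `fraction` of the surviving weights, then
    # retrains; `retrain` stands in for fine-tuning, weight rewinding,
    # or learning rate rewinding (see the next sketch).
    mask = np.ones_like(weights, dtype=bool)
    for _ in range(rounds):
        mask = magnitude_prune(weights, mask, fraction)
        weights = retrain(weights * mask, mask)
    return weights * mask, mask

# Toy run with a no-op "retrainer"; real retraining is epochs of SGD.
rng = np.random.default_rng(0)
weights, mask = iterative_prune(rng.normal(size=1000), retrain=lambda w, m: w)
print(f"sparsity: {1 - mask.mean():.2f}")  # ~0.49 after three 20% rounds
```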
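The paper's central comparison is between three ways of instantiating that retraining step. The sketch below states the difference in code. Here `train_fn` (assumed to run one epoch of SGD per learning rate it is given) and the epoch-(T − t) `checkpoint` are assumptions introduced for illustration, not the paper's API; T denotes the original training length in epochs.

```python
def finetune(final_weights, mask, lr_schedule, t, train_fn):
    # Fine-tuning: continue training the pruned network for t epochs
    # at the final (lowest) learning rate of the original schedule.
    return train_fn(final_weights * mask, mask, [lr_schedule[-1]] * t)

def weight_rewind(checkpoint, mask, lr_schedule, t, train_fn):
    # Weight rewinding: reset surviving weights to their values from
    # epoch T - t, then rerun the last t epochs of the original
    # schedule. `checkpoint` holds the saved epoch-(T - t) weights.
    return train_fn(checkpoint * mask, mask, lr_schedule[-t:])

def lr_rewind(final_weights, mask, lr_schedule, t, train_fn):
    # Learning rate rewinding: keep the final weights, but retrain
    # using the learning rate schedule from the last t epochs.
    return train_fn(final_weights * mask, mask, lr_schedule[-t:])
```

Weight rewinding restores both the weights and the learning rate schedule from epoch T − t; learning rate rewinding restores only the schedule, which is what makes the two directly comparable against fine-tuning at a fixed retraining budget of t epochs.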
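Finally, a minimal sketch of the random validation split quoted under Dataset Splits, assuming a CIFAR-10-sized test set of 10,000 examples; the function name and seed are illustrative, not taken from the paper's code.

```python
import numpy as np

def split_vision_test_set(num_test, val_fraction=0.2, seed=0):
    # Randomly hold out `val_fraction` of the original test set as a
    # validation set; the remainder is used to report test accuracy.
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_test)
    n_val = int(val_fraction * num_test)
    return perm[:n_val], perm[n_val:]  # (validation idx, test idx)

val_idx, test_idx = split_vision_test_set(10_000)  # CIFAR-10 test set
print(len(val_idx), len(test_idx))                 # -> 2000 8000
```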