Network Morphism

Authors: Tao Wei, Changhu Wang, Yong Rui, Chang Wen Chen

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark datasets and typical neural networks demonstrate the effectiveness of the proposed network morphism scheme.
Researcher Affiliation | Collaboration | Microsoft Research, Beijing, China, 100080; Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, 14260
Pseudocode | Yes | Algorithm 1: General Network Morphism; Algorithm 2: Practical Network Morphism (the function-preserving idea is sketched after this table).
Open Source Code | No | The paper does not provide any explicit statement about releasing open-source code or a link to a code repository for the described methodology.
Open Datasets | Yes | The first experiment is conducted on the MNIST dataset (LeCun et al., 1998). Extensive experiments were conducted on the CIFAR10 dataset (Krizhevsky & Hinton, 2009). Experiments were also conducted on the ImageNet dataset (Russakovsky et al., 2014).
Dataset Splits | Yes | MNIST is a standard dataset for handwritten digit recognition, with 60,000 training images and 10,000 testing images. CIFAR10 is an image recognition dataset of 32×32 color images, with 50,000 training images and 10,000 testing images. The ImageNet models were trained on 1.28 million training images and tested on 50,000 validation images.
Hardware Specification | No | VGG16 was trained for around 2~3 months of single-GPU time (Simonyan & Zisserman, 2014). The paper mentions only a "single GPU" without specifying a model or other hardware details.
Software Dependencies | No | The baseline network adopted is the Caffe (Jia et al., 2014) cifar10_quick model. Caffe is named, but no version number is given.
Experiment Setup | Yes | The sharp drop and increase in Fig. 8 are caused by changes of the learning rate. Since the parent network was learned with a much finer learning rate (1e-5) at the end of its training, training of the morphed network was restarted at a coarser learning rate (1e-3), hence the initial sharp drop. At 20k/30k iterations, the learning rate was reduced to 1e-4/1e-5, which caused the sharp increase. (This schedule is sketched after the table.)
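The Pseudocode row above refers to the paper's Algorithm 1 (General Network Morphism) and Algorithm 2 (Practical Network Morphism), which are not reproduced here. The minimal NumPy sketch below only illustrates the constraint those algorithms enforce: after morphing, the child network must compute exactly the same function as its well-trained parent. It does so in the simplest possible way, inserting a new fully-connected layer initialized to the identity; the paper's algorithms instead derive the new weights by decomposing an existing layer's kernel. All variable names are illustrative.

```python
import numpy as np

# A hypothetical parent "network": one fully-connected layer with weights W.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))     # maps a 16-d input to an 8-d output
x = rng.standard_normal(16)          # an arbitrary input
parent_out = W @ x

# Depth morphism (simplest case): split the layer into W2 @ W1 with W2 @ W1 == W
# by initializing the newly inserted layer W1 as the identity.
W1 = np.eye(16)                      # new layer, identity-initialized
W2 = W.copy()                        # original weights carried over unchanged

child_out = W2 @ (W1 @ x)

# The morphed (child) network preserves the parent's function exactly,
# so further training can continue from the parent's accuracy rather than from scratch.
assert np.allclose(parent_out, child_out)
```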
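The Experiment Setup row describes a multi-step learning-rate schedule for retraining the morphed network: a coarse rate of 1e-3 from the start, reduced to 1e-4 at 20k iterations and to 1e-5 at 30k iterations. The short Python sketch below encodes that schedule; the function name and the iteration-based formulation (rather than a Caffe solver configuration, which the table does not detail) are assumptions for illustration.

```python
def learning_rate(iteration: int) -> float:
    """Learning rate at a given training iteration (assumed step schedule:
    1e-3 from the start, 1e-4 after 20k iterations, 1e-5 after 30k)."""
    if iteration < 20_000:
        return 1e-3   # coarse rate restored when retraining the morphed network
    elif iteration < 30_000:
        return 1e-4   # first reduction, at 20k iterations
    else:
        return 1e-5   # final reduction, at 30k iterations

# Example: rates at a few checkpoints along training.
for it in (0, 19_999, 20_000, 30_000):
    print(it, learning_rate(it))
```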