Qualitatively characterizing neural network optimization problems

Authors: Ian Goodfellow, Oriol Vinyals, and Andrew Saxe

ICLR 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We introduce a simple analysis technique to look for evidence that such networks are overcoming local optima. We find that, in fact, on a straight path from initialization to solution, a variety of state of the art neural networks never encounter any significant obstacles. In this paper, we present a variety of simple experiments designed to roughly characterize the objective functions involved in neural network training. (A minimal sketch of this linear-interpolation analysis appears after the table.)
Researcher Affiliation | Collaboration | Google Inc., Mountain View, CA; Department of Electrical Engineering, Stanford University, Stanford, CA; {goodfellow,vinyals}@google.com, asaxe@stanford.edu
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Maxout network: This model was retrained using the publicly available implementation used by Goodfellow et al. (2013c). The code is available at: https://github.com/lisa-lab/pylearn2/blob/master/pylearn2/scripts/papers/maxout/mnist_pi.yaml (a sketch of how such a Pylearn2 YAML config is typically launched appears after the table).
Open Datasets | Yes | For these experiments we use the MNIST dataset (LeCun et al., 1998). The linear interpolation experiment for a convolutional maxout network on the CIFAR-10 dataset (Krizhevsky & Hinton, 2009). LSTM regularized with dropout (Hochreiter & Schmidhuber, 1997; Zaremba et al., 2014) on the Penn Treebank dataset (Marcus et al., 1993).
Dataset Splits | No | The paper mentions the use of a "validation set" in figures (e.g., "J(θ) validation" in Figure 1) and in the text (e.g., "early stopping on a validation set criterion"). However, it does not provide specific details on the split percentages or sizes of the validation set for reproduction.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, number of machines) used to run the experiments.
Software Dependencies | No | The paper acknowledges the developers of "Theano (Bergstra et al., 2010; Bastien et al., 2012) and Pylearn2 (Goodfellow et al., 2013b)" but does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | A EXPERIMENT DETAILS: All of our experiments except for the sigmoid network were using hyperparameters taken directly from the literature. We fully specify each of them here. Adversarially trained maxout network: This model is the one used by Goodfellow et al. (2014). There is no public configuration for it, but the paper describes how to modify the previous best maxout network to obtain it. ReLU network without dropout: We simply removed the dropout from the preceding configuration file.
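
The analysis the abstract refers to evaluates the objective J(θ) along the straight line between the initial parameters θ_0 and the learned parameters θ_1, i.e. θ(α) = (1 - α)θ_0 + αθ_1 for α in [0, 1]. The snippet below is a minimal sketch of that procedure, not the authors' code: it uses PyTorch rather than the Theano/Pylearn2 stack from the paper, and model, loss_fn, and data_loader are hypothetical placeholders supplied by the caller.

    # Minimal sketch (not the authors' code) of the paper's linear-interpolation analysis:
    # evaluate the loss along theta(alpha) = (1 - alpha) * theta_0 + alpha * theta_1.
    import copy
    import torch

    def interpolate_loss(model, theta_0, theta_1, loss_fn, data_loader, num_points=25):
        """Return (alphas, losses) for the objective along the straight path theta_0 -> theta_1."""
        alphas = torch.linspace(0.0, 1.0, num_points)
        losses = []
        probe = copy.deepcopy(model)  # scratch copy so the trained model is left untouched
        probe.eval()
        with torch.no_grad():
            for alpha in alphas:
                # Set every parameter to its convex combination for this alpha.
                for p, p0, p1 in zip(probe.parameters(), theta_0, theta_1):
                    p.copy_((1.0 - alpha) * p0 + alpha * p1)
                # Average the objective over the evaluation data.
                total, count = 0.0, 0
                for inputs, targets in data_loader:
                    total += loss_fn(probe(inputs), targets).item() * targets.shape[0]
                    count += targets.shape[0]
                losses.append(total / count)
        return alphas.tolist(), losses

Here θ_0 and θ_1 would be parameter snapshots (e.g. detached clones of model.parameters()) taken at initialization and after training; plotting losses against alphas gives the one-dimensional cross-section of the objective that the paper inspects for obstacles between initialization and solution.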
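
For the Open Source Code row, mnist_pi.yaml is a Pylearn2 experiment description. The lines below are a hedged sketch of how such a config is conventionally launched with Pylearn2's yaml_parse API; this is not taken from the paper, and it assumes Pylearn2 and Theano are installed and that the PYLEARN2_DATA_PATH environment variable points at a local MNIST copy.

    # Hedged sketch: launching a Pylearn2 YAML experiment such as mnist_pi.yaml.
    # Assumes Pylearn2/Theano are installed and PYLEARN2_DATA_PATH is set.
    from pylearn2.config import yaml_parse

    with open("mnist_pi.yaml") as f:       # config fetched from the pylearn2 repository
        train = yaml_parse.load(f.read())  # instantiates dataset, model, and training algorithm
    train.main_loop()                      # runs training as the YAML specifies

The ReLU-network-without-dropout variant described under Experiment Setup would amount to removing the dropout-related entries from such a YAML file before loading it.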