Large-Scale Evolution of Image Classifiers

Authors: Esteban Real, Sherry Moore, Andrew Selle, Saurabh Saxena, Yutaka Leon Suematsu, Jie Tan, Quoc V. Le, Alexey Kurakin

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Training and validation is done on the CIFAR-10 dataset. This dataset consists of 50,000 training examples and 10,000 test examples, all of which are 32 x 32 color images labeled with 1 of 10 common object classes (Krizhevsky & Hinton, 2009). 5,000 of the training examples are held out in a validation set. The remaining 45,000 examples constitute our actual training set. The training set is augmented as in He et al. (2016). The CIFAR-100 dataset has the same number of dimensions, colors and examples as CIFAR-10, but uses 100 classes, making it much more challenging. ... We used the algorithm in Section 3 to perform several experiments. Each experiment evolves a population in a few days, typified by the example in Figure 1. The figure also contains examples of the architectures discovered, which turn out to be surprisingly simple. ... Across all 5 experiment runs, the best model by validation accuracy has a testing accuracy of 94.6%. (The He et al. augmentation is sketched after this table.)
Researcher Affiliation | Industry | Esteban Real (1), Sherry Moore (1), Andrew Selle (1), Saurabh Saxena (1), Yutaka Leon Suematsu (2), Jie Tan (1), Quoc V. Le (1), Alexey Kurakin (1) ... (1) Google Brain, Mountain View, California, USA; (2) Google Research, Mountain View, California, USA.
Pseudocode | No | The paper describes the evolutionary algorithm steps in text (Section 3.1) and mentions that "Code and more detail about the methods described below can be found in Supplementary Section S1," but it does not include a structured pseudocode or algorithm block in the main paper. (A hedged sketch of the tournament-selection loop appears after this table.)
Open Source Code | Yes | Code and more detail about the methods described below can be found in Supplementary Section S1.
Open Datasets | Yes | Training and validation is done on the CIFAR-10 dataset. This dataset consists of 50,000 training examples and 10,000 test examples, all of which are 32 x 32 color images labeled with 1 of 10 common object classes (Krizhevsky & Hinton, 2009). ... The CIFAR-100 dataset has the same number of dimensions, colors and examples as CIFAR-10, but uses 100 classes, making it much more challenging.
Dataset Splits | Yes | This dataset consists of 50,000 training examples and 10,000 test examples, all of which are 32 x 32 color images labeled with 1 of 10 common object classes (Krizhevsky & Hinton, 2009). 5,000 of the training examples are held out in a validation set. The remaining 45,000 examples constitute our actual training set. (See the data-split sketch after this table.)
Hardware Specification | No | To achieve scale, we developed a massively-parallel, lock-free infrastructure. Many workers operate asynchronously on different computers. ... All experiments occupied the same amount and type of hardware.
Software Dependencies | No | Training is done with TensorFlow (Abadi et al., 2016), using SGD with a momentum of 0.9 (Sutskever et al., 2013), a batch size of 50, and a weight decay of 0.0001.
Experiment Setup | Yes | The population size is 1000 individuals, unless otherwise stated. The number of workers is always 1/4 of the population size. ... Training is done with TensorFlow (Abadi et al., 2016), using SGD with a momentum of 0.9 (Sutskever et al., 2013), a batch size of 50, and a weight decay of 0.0001. Each training runs for 25,600 steps, a value chosen to be brief enough so that each individual could be trained in a few seconds to a few hours, depending on model size. (A training-configuration sketch appears after this table.)
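The Dataset Splits row quotes a 45,000/5,000 train/validation split of CIFAR-10's 50,000 training images. The following minimal Python sketch reproduces that split with tf.keras.datasets.cifar10, which is an assumption for illustration: the paper's supplementary code may load and partition the data differently, and since the paper does not say which 5,000 examples are held out, taking the first 5,000 here is arbitrary.

    import tensorflow as tf

    # Load CIFAR-10: 50,000 training and 10,000 test images, each 32x32x3,
    # labeled with 1 of 10 classes (Krizhevsky & Hinton, 2009).
    (x_train_full, y_train_full), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

    # Hold out 5,000 training examples as a validation set; the remaining
    # 45,000 form the actual training set. Which 5,000 are held out is an
    # assumption (the paper does not specify); we simply take the first 5,000.
    x_val, y_val = x_train_full[:5000], y_train_full[:5000]
    x_train, y_train = x_train_full[5000:], y_train_full[5000:]

    print(x_train.shape, x_val.shape, x_test.shape)
    # (45000, 32, 32, 3) (5000, 32, 32, 3) (10000, 32, 32, 3)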
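The training set is said to be "augmented as in He et al. (2016)". A common reading of that reference for CIFAR is 4-pixel zero padding, a random 32x32 crop, and a random horizontal flip; the sketch below implements that reading with tf.image and is not taken from the paper's supplementary code.

    import tensorflow as tf

    def augment(image):
        # Pad 4 pixels of zeros on each side (32x32 -> 40x40), then take a
        # random 32x32 crop.
        image = tf.image.resize_with_crop_or_pad(image, 40, 40)
        image = tf.image.random_crop(image, size=[32, 32, 3])
        # Random horizontal flip.
        image = tf.image.random_flip_left_right(image)
        return image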
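The Experiment Setup row lists the per-individual training configuration: TensorFlow, SGD with momentum 0.9, batch size 50, weight decay 0.0001, and 25,600 training steps. The sketch below expresses those numbers in current TensorFlow/Keras syntax; the small convolutional model, the learning rate of 0.1, and the explicit L2 penalty used for weight decay are illustrative assumptions rather than the authors' implementation (in the paper the architecture and learning rate are part of the evolved DNA).

    import tensorflow as tf

    BATCH_SIZE = 50
    TRAIN_STEPS = 25_600
    MOMENTUM = 0.9
    WEIGHT_DECAY = 1e-4

    # Hypothetical stand-in for an evolved architecture.
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(10),
    ])

    optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=MOMENTUM)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    @tf.function
    def train_step(images, labels):
        with tf.GradientTape() as tape:
            logits = model(images, training=True)
            loss = loss_fn(labels, logits)
            # Weight decay of 0.0001, applied here as an explicit L2 penalty
            # on the kernels (one possible implementation).
            loss += WEIGHT_DECAY * tf.add_n(
                [tf.nn.l2_loss(v) for v in model.trainable_variables if "kernel" in v.name])
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss

    # Batch size 50, 25,600 steps; x_train/y_train and augment() come from
    # the data-split and augmentation sketches above.
    dataset = (tf.data.Dataset.from_tensor_slices(
                   (tf.cast(x_train, tf.float32) / 255.0, y_train.flatten()))
               .shuffle(45_000)
               .map(lambda x, y: (augment(x), y))
               .repeat()
               .batch(BATCH_SIZE))
    for images, labels in dataset.take(TRAIN_STEPS):
        train_step(images, labels)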
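Since the main paper has no pseudocode block, the following Python sketch reconstructs the tournament-selection loop that Section 3.1 describes in text: a worker repeatedly samples two individuals at random, kills the one with worse validation fitness, and replaces it with a trained, mutated copy of the better one. Everything marked as a placeholder (sample_architecture, mutate, train_and_evaluate, the toy DNA) is a hypothetical stand-in, not the authors' supplementary code, and the real system runs many such workers asynchronously on a shared population.

    import random

    # --- Hypothetical placeholders, not the authors' code -------------------
    def sample_architecture():
        # The paper starts evolution from trivial models; here a "DNA" is a toy dict.
        return {"depth": 0}

    def mutate(dna):
        # The paper applies one mutation from a fixed set (e.g. insert/remove a
        # convolution, alter the learning rate); here we only perturb a toy field.
        child = dict(dna)
        child["depth"] = max(0, child["depth"] + random.choice([-1, 1]))
        return child

    def train_and_evaluate(dna):
        # Stand-in for training the model on CIFAR-10 and returning its
        # validation accuracy; here fitness is just a noisy function of depth.
        return dna["depth"] + random.random()

    # --- Tournament-selection loop described in Section 3.1 -----------------
    def evolve(population_size=1000, num_cycles=10_000):
        population = [{"dna": sample_architecture()} for _ in range(population_size)]
        for ind in population:
            ind["fitness"] = train_and_evaluate(ind["dna"])

        for _ in range(num_cycles):
            # A worker picks two individuals at random and compares their fitness.
            i, j = random.sample(range(len(population)), 2)
            a, b = population[i], population[j]
            worse_idx, better = (i, b) if a["fitness"] < b["fitness"] else (j, a)

            # The worse individual is killed; the better one becomes a parent:
            # its DNA is copied, mutated once, trained, and added back.
            population.pop(worse_idx)
            child_dna = mutate(better["dna"])
            population.append({"dna": child_dna,
                               "fitness": train_and_evaluate(child_dna)})

        return max(population, key=lambda ind: ind["fitness"])

    if __name__ == "__main__":
        best = evolve(population_size=50, num_cycles=500)  # toy sizes for a quick run
        print(best)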