Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution

Authors: Thomas Elsken, Jan Hendrik Metzen, Frank Hutter

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate LEMONADE for up to five objectives on two different search spaces for image classification: (i) non-modularized architectures and (ii) cells that are used as repeatable building blocks within an architecture (Zoph et al., 2018; Zhong et al., 2018) and also allow transfer to other data sets. LEMONADE returns a population of CNNs covering architectures with 10,000 to 10,000,000 parameters. Within only 5 days on 16 GPUs, LEMONADE discovers architectures that are competitive in terms of predictive performance and resource consumption with hand-designed networks, such as MobileNetV2 (Sandler et al., 2018), as well as architectures that were automatically designed using 40x greater resources (Zoph et al., 2018) and other multi-objective methods (Dong et al., 2018).
Researcher Affiliation | Collaboration | Thomas Elsken (Bosch Center for Artificial Intelligence and University of Freiburg) Thomas.Elsken@de.bosch.com; Jan Hendrik Metzen (Bosch Center for Artificial Intelligence) JanHendrik.Metzen@de.bosch.com; Frank Hutter (University of Freiburg) fh@cs.uni-freiburg.de
Pseudocode | Yes | Algorithm 1: LEMONADE
Open Source Code | No | The paper does not provide an explicit statement or link regarding the availability of open-source code for the described methodology.
Open Datasets | Yes | We present results for LEMONADE on searching neural architectures for CIFAR-10. ... We also transfer the discovered cells from the last setting to ImageNet (Section 5.4) and its down-scaled version ImageNet64x64 (Chrabaszcz et al., 2017) (Section 5.3).
Dataset Splits | Yes | The training set is split up in a training (45.000) and a validation (5.000) set for the purpose of architecture search.
Hardware Specification | Yes | Within only 5 days on 16 GPUs, LEMONADE discovers architectures that are competitive in terms of predictive performance and resource consumption with hand-designed networks... In terms of inference time (bottom right), LEMONADE clearly finds models superior to the baselines. We highlight that this result has been achieved based on using only 80 GPU days for LEMONADE compared to 2000 in Zoph et al. (2018) and with a significantly more complex Search Space I... In detail, we measured the time for doing inference on a batch of 100 images on a Titan X GPU.
Software Dependencies | No | The paper mentions SGD, Batch Normalization, and cosine annealing as methods but does not specify software dependencies such as programming language versions or library versions (e.g., PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | We apply the standard data augmentation scheme described by Loshchilov & Hutter (2017), as well as the recently proposed methods mixup (Zhang et al., 2017) and Cutout (Devries & Taylor, 2017). The training set is split up in a training (45.000) and a validation (5.000) set for the purpose of architecture search. We use weight decay (5·10^-4) for all models. We use batch size 64 throughout all experiments. During architecture search as well as for generating the random search baseline, all models are trained for 20 epochs using SGD with cosine annealing (Loshchilov & Hutter, 2017), decaying the learning rate from 0.01 to 0. For evaluating the test performance, all models are trained from scratch on the training and validation set with the same setup as described above except for 1) we train for 600 epochs and 2) the initial learning rate is set to 0.025.
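
The Pseudocode row above refers to the paper's Algorithm 1, which describes LEMONADE's evolutionary loop. The following is a heavily simplified, hypothetical Python sketch of that loop (all function and parameter names are placeholders, not the authors' code, and the two-stage child sampling of the full algorithm is omitted): parents are sampled inversely proportional to a density estimate over a cheap objective, children are generated by (approximate) network morphisms, and the Pareto front over the cheap and expensive objectives becomes the next population.

```python
import random

def pareto_front(population, objectives):
    """Return the subset of `population` not dominated on any objective (minimization)."""
    front = []
    for cand in population:
        dominated = any(
            all(obj(other) <= obj(cand) for obj in objectives)
            and any(obj(other) < obj(cand) for obj in objectives)
            for other in population if other is not cand
        )
        if not dominated:
            front.append(cand)
    return front

def lemonade(init_population, n_generations, n_children,
             cheap_obj, expensive_obj, kde_density,
             mutate_via_morphism, train_briefly):
    """Simplified LEMONADE loop; all callables are user-supplied placeholders."""
    population = list(init_population)
    for _ in range(n_generations):
        # Sample parents inversely proportional to their density w.r.t. the cheap
        # objective (e.g. log #params), so sparsely populated regions are explored.
        dens = [kde_density(cheap_obj(p), population) for p in population]
        weights = [1.0 / max(d, 1e-12) for d in dens]
        parents = random.choices(population, weights=weights, k=n_children)
        # Children are generated by (approximate) network morphisms, so they
        # inherit the parent's weights (Lamarckian inheritance) and only need
        # a short additional training phase before evaluation.
        children = [mutate_via_morphism(p) for p in parents]
        for child in children:
            train_briefly(child)
        # Keep only the Pareto-optimal individuals w.r.t. all objectives.
        population = pareto_front(population + children, [expensive_obj, cheap_obj])
    return population
```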
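
The 45,000 / 5,000 split quoted in the Dataset Splits row could be reproduced along these lines. The paper does not name its deep-learning framework (see the Software Dependencies row), so the use of PyTorch/torchvision below is an assumption, and the fixed index-based split is a guess rather than the authors' exact procedure:

```python
from torch.utils.data import DataLoader, Subset
from torchvision import datasets, transforms

transform = transforms.ToTensor()  # augmentation (mixup, Cutout, ...) omitted for brevity
full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transform)

# Fixed split of the 50,000 CIFAR-10 training images: 45,000 for training and
# 5,000 for validation during architecture search.
train_set = Subset(full_train, range(45_000))
val_set = Subset(full_train, range(45_000, 50_000))

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)   # batch size 64 as stated
val_loader = DataLoader(val_set, batch_size=64, shuffle=False)
```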
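
The inference-time measurement quoted in the Hardware Specification row (a batch of 100 images on a Titan X GPU) could be approximated as below; the warm-up runs, repeat count, and use of PyTorch are assumptions, since the paper does not describe the exact timing protocol:

```python
import time
import torch

def measure_inference_time(model, n_repeats=10):
    """Average forward-pass time for a batch of 100 CIFAR-sized images on the GPU."""
    model = model.eval().cuda()
    batch = torch.randn(100, 3, 32, 32, device="cuda")
    with torch.no_grad():
        for _ in range(3):                      # warm-up runs (assumption)
            model(batch)
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(n_repeats):
            model(batch)
        torch.cuda.synchronize()
    return (time.time() - start) / n_repeats
```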
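
Finally, the hyperparameters in the Experiment Setup row map onto a standard SGD-plus-cosine-annealing configuration. The sketch below assumes PyTorch (not stated in the paper) and a momentum value of 0.9, which the paper does not specify:

```python
import torch

def make_optimizer_and_scheduler(model, search_phase=True):
    """SGD + cosine annealing as described; momentum 0.9 is an assumption."""
    lr = 0.01 if search_phase else 0.025        # architecture search vs. final evaluation
    epochs = 20 if search_phase else 600
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9,
                                weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer,
                                                           T_max=epochs, eta_min=0.0)
    return optimizer, scheduler, epochs
```

Calling scheduler.step() once per epoch would then anneal the learning rate from its initial value down to (roughly) zero by the end of training, matching the described schedule.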