Evaluating the Search Phase of Neural Architecture Search
Authors: Kaicheng Yu, Christian Sciuto, Martin Jaggi, Claudiu Musat, Mathieu Salzmann
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a series of experiments on the Penn Tree Bank (PTB) (Marcus et al., 1994a) and CIFAR10 (Krizhevsky et al., 2009) datasets, in which we compare the state-of-the-art NAS algorithms whose code is publicly available, DARTS (Liu et al., 2019b), NAO (Luo et al., 2018) and ENAS (Pham et al., 2018), to our random policy. |
| Researcher Affiliation | Collaboration | Kaicheng Yu (Computer Vision Lab, EPFL, kaicheng.yu@epfl.ch); Christian Sciuto (Daskell, christian.sciuto@daskell.com); Martin Jaggi (Machine Learning and Optimization Lab, EPFL, martin.jaggi@epfl.ch); Claudiu Musat (Swisscom Digital Lab, claudiu.musat@swisscom.com); Mathieu Salzmann (Computer Vision Lab, EPFL, mathieu.salzmann@epfl.ch) |
| Pseudocode | No | The paper describes methods and experiments in text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/kcyu2014/eval-nas. |
| Open Datasets | Yes | We perform a series of experiments on the Penn Tree Bank (PTB) (Marcus et al., 1994a) and CIFAR10 (Krizhevsky et al., 2009) datasets... |
| Dataset Splits | No | The paper mentions using Penn Tree Bank and CIFAR-10 datasets and refers to 'validation' and 'test split' in the context of evaluation, but does not explicitly provide specific percentages or counts for training, validation, and test dataset splits for its experiments. |
| Hardware Specification | No | The paper refers to 'GPU hours' and 'GPU days' as resources but does not specify any particular hardware components such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper describes the use of various optimizers (RMSProp, SGD) and network architectures (LSTM, VAE) but does not provide specific version numbers for any software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | Once a best architecture is identified by the search phase, it is used for evaluation, i.e., we train the chosen architecture from scratch for 1000 epochs for RNN and 600 for CNN. For our RNN comparisons, we follow the procedure used in (Liu et al., 2019b; Pham et al., 2018; Luo et al., 2018) for the final evaluation, consisting of keeping the connections found for the best architecture in the search phase but increasing the hidden state size (to 850 in practice)... For ENAS, we set the LSTM sampler size to 64 and keep the temperature as 5.0. The number of aggregation step of each sampler training is set to 10. |
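The evaluation-phase settings quoted in the last row can be collected into a single configuration sketch. This is only an illustration of the reported hyperparameters; the dictionary name and key names are hypothetical and do not come from the authors' released code.

```python
# Hedged sketch: final-evaluation hyperparameters as quoted from the paper,
# grouped into a plain dict. All identifiers here are illustrative.
FINAL_EVAL_CONFIG = {
    "rnn": {
        "train_epochs": 1000,  # best RNN cell retrained from scratch
        "hidden_size": 850,    # hidden state enlarged for final evaluation
    },
    "cnn": {
        "train_epochs": 600,   # best CNN architecture retrained from scratch
    },
    "enas_sampler": {
        "lstm_size": 64,        # ENAS controller (LSTM sampler) size
        "temperature": 5.0,     # sampling temperature kept at 5.0
        "aggregation_steps": 10,  # aggregation steps per sampler training
    },
}

if __name__ == "__main__":
    for phase, params in FINAL_EVAL_CONFIG.items():
        print(phase, params)
```

Collecting the values this way makes it easy to check, at a glance, which settings the paper specifies and which (e.g., dataset splits, hardware) it leaves unreported.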