Neural Optimizer Search with Reinforcement Learning
Authors: Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On CIFAR-10, our method discovers several update rules that are better than many commonly used optimizers, such as Adam, RMSProp, or SGD with and without Momentum on a ConvNet model. These optimizers can also be transferred to perform well on different neural network architectures, including Google's neural machine translation system. (One such discovered rule is sketched after the table.) |
| Researcher Affiliation | Industry | ¹Google Brain. Correspondence to: Irwan Bello <ibello@google.com>, Barret Zoph <barretzoph@google.com>, Vijay Vasudevan <vrv@google.com>, Quoc V. Le <qvl@google.com>. |
| Pseudocode | No | The paper describes the architecture and process but does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | Yes | These child networks are trained on the CIFAR-10 dataset, one of the most benchmarked datasets in deep learning. |
| Dataset Splits | Yes | The child networks have a batch size of 100 and evaluate the update rule on a fixed held-out validation set of 5,000 examples. (This split is sketched in code after the table.) |
| Hardware Specification | No | The paper mentions 'CPUs' and 'GPUs' but does not specify exact models or types (e.g., 'Intel Core i7' or 'NVIDIA A100'). |
| Software Dependencies | No | The paper mentions 'TensorFlow (Abadi et al., 2016)' but does not provide a specific version number for it or any other software dependencies. |
| Experiment Setup | Yes | Across all experiments, our controller RNN is trained with the ADAM optimizer with a learning rate of 10⁻⁵ and a minibatch size of 5. The controller is a single-layer LSTM with hidden state size 150 and weights are initialized uniformly at random between -0.08 and 0.08. We also use an entropy penalty to aid in exploration. This entropy penalty coefficient is set to 0.0015. ... We set ϵ to 10⁻⁸, β₁ to 0.9 and β₂ = β₃ to 0.999. (This configuration is sketched in code after the table.) |
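
For context on the Research Type row: the strongest discovered rule the paper reports is the one it names PowerSign, which scales the gradient by e^(sign(g)·sign(m)), where m is a running average of gradients. A minimal NumPy sketch of that rule; the learning rate and decay values here are illustrative defaults, not values quoted in the table above:

```python
import numpy as np

def powersign_step(w, g, m, lr=0.1, beta=0.9):
    """One PowerSign update: w <- w - lr * exp(sign(g) * sign(m)) * g.

    m is a running average of gradients; lr and beta are illustrative
    defaults, not values quoted in the table above.
    """
    m = beta * m + (1 - beta) * g                      # update running average of gradients
    w = w - lr * np.exp(np.sign(g) * np.sign(m)) * g   # scale the step up/down by sign agreement
    return w, m
```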
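
The Dataset Splits row pins down the evaluation protocol. A minimal sketch of that split via `tf.keras.datasets`; the paper does not say which 5,000 examples were held out, so taking the last 5,000 of the CIFAR-10 training set is an assumption:

```python
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Fixed held-out validation set of 5,000 examples (which examples were
# held out is unspecified; taking the last 5,000 is an assumption).
x_val, y_val = x_train[-5000:], y_train[-5000:]
x_train, y_train = x_train[:-5000], y_train[:-5000]

BATCH_SIZE = 100  # child-network batch size quoted above
train_ds = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
            .shuffle(len(x_train))
            .batch(BATCH_SIZE))
val_ds = tf.data.Dataset.from_tensor_slices((x_val, y_val)).batch(BATCH_SIZE)
```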
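
The Experiment Setup row is concrete enough to reconstruct the controller configuration. A minimal sketch using the modern `tf.keras` API (the paper used an earlier TensorFlow release); `NUM_TOKENS` is a hypothetical placeholder for the controller's vocabulary of operands and operations, and the loss is a hedged REINFORCE-with-entropy-bonus reconstruction, since the table does not quote the paper's exact objective or baseline:

```python
import tensorflow as tf

NUM_TOKENS = 32  # hypothetical vocabulary size of operands/operations
init = tf.keras.initializers.RandomUniform(-0.08, 0.08)  # weights uniform in [-0.08, 0.08]

# Single-layer LSTM controller with hidden state size 150, emitting
# per-step logits over the token vocabulary.
controller = tf.keras.Sequential([
    tf.keras.layers.Embedding(NUM_TOKENS, 32, embeddings_initializer=init),  # 32 is a placeholder dim
    tf.keras.layers.LSTM(150, kernel_initializer=init,
                         recurrent_initializer=init, return_sequences=True),
    tf.keras.layers.Dense(NUM_TOKENS, kernel_initializer=init),
])

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-5)  # ADAM, learning rate 10^-5
BATCH_SIZE = 5          # controller minibatch size
ENTROPY_COEF = 0.0015   # entropy penalty coefficient

def controller_loss(logits, sampled_tokens, rewards):
    """REINFORCE-style loss with an entropy bonus (a hedged reconstruction)."""
    # -log p(token) at every step, shape [batch, time].
    neg_log_prob = tf.keras.losses.sparse_categorical_crossentropy(
        sampled_tokens, logits, from_logits=True)
    pg_loss = tf.reduce_mean(tf.reduce_sum(neg_log_prob, axis=1) * rewards)
    probs = tf.nn.softmax(logits)
    entropy = -tf.reduce_sum(probs * tf.math.log(probs + 1e-10), axis=-1)
    return pg_loss - ENTROPY_COEF * tf.reduce_mean(entropy)
```

In use, the controller's per-step logits would be sampled autoregressively to produce an update-rule string whose validation accuracy supplies `rewards`; that sampling loop is omitted here since the table does not describe it.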