Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Understanding and Simplifying One-Shot Architecture Search
Authors: Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, Quoc Le
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With careful experimental analysis, we show that it is possible to efficiently identify promising architectures from a complex search space without either hypernetworks or RL. See also Section 4, One-Shot Model Experiments. |
| Researcher Affiliation | Industry | Google Brain, Mountain View, CA. Correspondence to: Gabriel Bender <EMAIL>. |
| Pseudocode | No | The paper includes diagrams but no explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not mention any open-source code release or provide links to a code repository for the described methodology. |
| Open Datasets | Yes | On CIFAR-10 we used a 45,000 element training set, 5,000 element validation set, and 10,000 element test set. ImageNet was partitioned into a 1,281,167 training set, 50,046 element validation set, and 50,000 element test set. |
| Dataset Splits | Yes | On CIFAR-10 we used a 45,000 element training set, 5,000 element validation set, and 10,000 element test set. ImageNet was partitioned into a 1,281,167 training set, 50,046 element validation set, and 50,000 element test set. |
| Hardware Specification | Yes | Each one-shot model was trained for 5,000–10,000 steps (113–225 epochs) on a cluster of 16 P100 GPUs. and The One-Shot model was trained for 15k steps (about 47 epochs or 6 hours) with a batch size of 4,096 on four Cloud TPUs (16 chips). |
| Software Dependencies | No | Experiments were implemented using TensorFlow (Abadi et al., 2016). However, no specific version number for TensorFlow or any other software dependency is provided. |
| Experiment Setup | Yes | Each worker used a batch size of 64, which was divided into two ghost batches of size 32. We used a global learning rate of 0.1 and Nesterov momentum 0.9. and At the start of training, dropout was effectively disabled, while at the end of training, we had a dropout rate determined by the coefficient r = 0.1. |
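
The split sizes and training schedule quoted in the table can be sanity-checked with simple arithmetic. The following is a minimal sketch, not code from the paper: the helper `epochs_from_steps` and all variable names are hypothetical, and it assumes that every training step consumes one full batch drawn from the training split.

```python
def epochs_from_steps(steps: int, batch_size: int, train_size: int) -> float:
    """Passes over the training set, assuming one full batch per step (an assumption)."""
    return steps * batch_size / train_size


# CIFAR-10 split sizes as quoted in the table.
cifar10_splits = {"train": 45_000, "validation": 5_000, "test": 10_000}
cifar10_total = sum(cifar10_splits.values())

# Each worker's batch of 64 is divided into ghost batches of size 32.
worker_batch_size = 64
ghost_batch_size = 32
ghost_batches_per_worker = worker_batch_size // ghost_batch_size

# ImageNet one-shot run as quoted: 15k steps, batch size 4,096,
# 1,281,167 training examples.
imagenet_epochs = epochs_from_steps(15_000, 4_096, 1_281_167)

if __name__ == "__main__":
    print(f"CIFAR-10 examples across splits: {cifar10_total}")
    print(f"Ghost batches per worker: {ghost_batches_per_worker}")
    print(f"Estimated ImageNet epochs: {imagenet_epochs:.2f}")
```

Under these assumptions the CIFAR-10 splits sum to the dataset's 60,000 images, each worker batch yields two ghost batches, and the ImageNet estimate lands just under 48 epochs, in line with the "about 47 epochs" quoted in the table.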