Meta Architecture Search

Authors: Albert Shaw, Wei Wei, Weiyang Liu, Le Song, Bo Dai

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that on Imagenet classification, we can find a model that achieves 25.7% top-1 error and 8.1% top-5 error by adapting the architecture in less than an hour from an 8 GPU days pretrained meta-network. We believe our framework will open up new possibilities for efficient and massively scalable architecture search research across multiple tasks.
Researcher Affiliation | Collaboration | Albert Shaw¹, Wei Wei², Weiyang Liu¹, Le Song¹,³, Bo Dai¹,² (¹Georgia Institute of Technology, ²Google Research, ³Ant Financial)
Pseudocode | Yes | Algorithm 1: Bayesian meta Architecture SEarch (BASE)
Open Source Code | Yes | The code repository is available at https://github.com/ashaw596/meta_architecture_search.
Open Datasets | Yes | We show that on Imagenet classification, we can find a model that achieves 25.7% top-1 error and 8.1% top-5 error by adapting the architecture in less than an hour from an 8 GPU days pretrained meta-network.
Dataset Splits | No | To train our meta-network over a wide distribution of tasks with different image sizes, we define a new space of classification tasks by randomly selecting 10 Imagenet [7] classes and downsampling the images to 32×32, 64×64, or 224×224 image sizes.
Hardware Specification | Yes | All experiments were conducted with Nvidia 1080 Ti GPUs.
Software Dependencies | No | No specific versions of software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA) are mentioned; only general algorithmic components, such as Gumbel-Softmax, are described.
Experiment Setup | Yes | The meta-network was trained for 130 epochs. At each epoch, we sampled and trained on a total of 24 tasks, sampling 8 10-class discrimination tasks each from Imagenet32, Imagenet64, and Imagenet224.
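
The task distribution quoted in the Dataset Splits and Experiment Setup rows above pins down a concrete sampling procedure: 24 ten-class discrimination tasks per epoch, 8 from each of Imagenet32, Imagenet64, and Imagenet224. The following is a minimal sketch of that procedure; the names `sample_task`, `DATASETS`, and the dictionary task representation are illustrative assumptions, not taken from the released code.

```python
import random

NUM_CLASSES_PER_TASK = 10
TASKS_PER_RESOLUTION = 8
DATASETS = {32: "Imagenet32", 64: "Imagenet64", 224: "Imagenet224"}
ALL_CLASSES = list(range(1000))  # ImageNet has 1000 classes


def sample_task(resolution: int) -> dict:
    """Draw one 10-class discrimination task at the given resolution."""
    classes = random.sample(ALL_CLASSES, NUM_CLASSES_PER_TASK)
    return {"dataset": DATASETS[resolution], "classes": classes}


def sample_epoch_tasks() -> list:
    """24 tasks per epoch: 8 at each of the three image resolutions."""
    return [
        sample_task(res)
        for res in DATASETS
        for _ in range(TASKS_PER_RESOLUTION)
    ]


tasks = sample_epoch_tasks()
assert len(tasks) == 24
```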
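The Software Dependencies row notes that Gumbel-Softmax is the main algorithmic component named in the paper. For readers unfamiliar with it, the snippet below sketches the general mechanism in PyTorch: a differentiable soft sample over candidate operations. The operation list and the weighted-sum mixing scheme are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

# Candidate operations for one architecture decision (assumed set for illustration).
ops = torch.nn.ModuleList([
    torch.nn.Conv2d(16, 16, 3, padding=1),
    torch.nn.Conv2d(16, 16, 5, padding=2),
    torch.nn.MaxPool2d(3, stride=1, padding=1),
])

logits = torch.randn(len(ops), requires_grad=True)  # learnable architecture logits
x = torch.randn(2, 16, 8, 8)

# Gumbel-Softmax draws a differentiable, approximately one-hot weight vector,
# so gradients can flow back into the architecture logits.
weights = F.gumbel_softmax(logits, tau=1.0, hard=False)
out = sum(w * op(x) for w, op in zip(weights, ops))
print(out.shape)  # torch.Size([2, 16, 8, 8])
```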