Rapid Model Architecture Adaption for Meta-Learning

Authors: Yiren Zhao, Xitong Gao, Ilia Shumailov, Nicolo Fusi, Robert D. Mullins

NeurIPS 2022

Each reproducibility variable below is listed with its result and the corresponding LLM response (quotes are taken from the paper).
Research Type: Experimental. "We evaluate H-Meta-NAS on a range of popular few-shot learning benchmarks. For each dataset, we search for the meta-architecture and meta-parameters. We then adapt the meta-architecture with respect to a target hardware-constraint pair. In the evaluation stage, we then re-train the obtained hardware-aware task-specific architecture to convergence and report the final accuracy."
Researcher Affiliation: Collaboration. Yiren Zhao (Imperial College London, a.zhao@imperial.ac.uk); Xitong Gao (Shenzhen Institute of Advanced Technology, CAS, xt.gao@siat.ac.cn); Ilia Shumailov (University of Oxford, ilia.shumailov@chch.ox.ac.uk); Nicolo Fusi (Microsoft Research, fusi@microsoft.com); Robert D. Mullins (University of Cambridge, robert.mullins@cl.cam.ac.uk).
Pseudocode: Yes. "The full algorithm is detailed in Appendix D." (Appendix D contains "Algorithm 1: Genetic Algorithm for Meta-Architecture Search".)
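Appendix D itself is not reproduced on this page, so the following is only a minimal sketch of a generic genetic-algorithm search loop, not the paper's Algorithm 1. The per-layer choice space, the mutate/crossover operators and the toy fitness function are illustrative assumptions; only the pool size P and iteration count M mirror hyper-parameters the paper names (see the Experiment Setup entry below).

```python
import random

# Assumed search space: a genome is one discrete choice per layer
# (e.g. kernel size); both constants below are illustrative.
LAYER_CHOICES = [3, 5, 7]
NUM_LAYERS = 4

def random_genome():
    return [random.choice(LAYER_CHOICES) for _ in range(NUM_LAYERS)]

def mutate(genome, p=0.2):
    # Re-sample each gene with probability p.
    return [random.choice(LAYER_CHOICES) if random.random() < p else g
            for g in genome]

def crossover(a, b):
    # Single-point crossover between two parent genomes.
    cut = random.randrange(1, NUM_LAYERS)
    return a[:cut] + b[cut:]

def evolve(fitness, pool_size=50, iterations=20, elite_frac=0.2):
    # Generic GA loop: keep the top candidates, refill the pool with
    # mutated crossovers of elites, repeat for a fixed iteration budget.
    pool = [random_genome() for _ in range(pool_size)]
    for _ in range(iterations):
        ranked = sorted(pool, key=fitness, reverse=True)
        elites = ranked[: max(2, int(elite_frac * pool_size))]
        children = []
        while len(elites) + len(children) < pool_size:
            a, b = random.sample(elites, 2)
            children.append(mutate(crossover(a, b)))
        pool = elites + children
    return max(pool, key=fitness)

# Toy stand-in fitness; the paper instead scores candidates with
# super-net accuracy under a hardware constraint.
print(evolve(fitness=lambda g: -abs(sum(g) - 20)))
```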
Open Source Code: No. "No explicit statement or link providing access to the source code for the methodology described in this paper."
Open Datasets: Yes. "We consider three popular datasets in the few-shot learning community: Omniglot, Mini-ImageNet and Few-shot CIFAR100. Omniglot is a handwritten digits recognition task, containing 1623 samples [Lake et al., 2015]. Mini-ImageNet is first introduced by Vinyals et al. This dataset contains images of 100 different classes from the ILSVRC-12 dataset [Deng et al., 2009]; the splits are taken from Ravi et al. [Ravi and Larochelle, 2016]."
Dataset Splits: Yes. "All tasks are divided into three sets, namely meta-training (T_train), meta-validation (T_val) and meta-testing (T_test) sets. We use the meta train/validation/test splits originally used in Vinyals et al. [2016]. These splits cover 1028/172/423 classes (characters). We pick 16K training samples and 10K validation samples to train and test the latency predictor, which is the same setup used in OFA."
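Since Torchmeta is among the stated dependencies (see Software Dependencies below), here is a minimal sketch of loading the Omniglot meta-training split as few-shot tasks with Torchmeta's helper API. The 5-way 1-shot configuration, batch size and data folder are illustrative choices, not the paper's settings.

```python
from torchmeta.datasets.helpers import omniglot
from torchmeta.utils.data import BatchMetaDataLoader

# Meta-training split of Omniglot, served as 5-way 1-shot tasks
# with 15 query examples per class (values illustrative).
dataset = omniglot("data", ways=5, shots=1, test_shots=15,
                   meta_train=True, download=True)
loader = BatchMetaDataLoader(dataset, batch_size=16, num_workers=4)

for batch in loader:
    support_x, support_y = batch["train"]  # support set of each task
    query_x, query_y = batch["test"]       # query set of each task
    print(support_x.shape)                 # e.g. torch.Size([16, 5, 1, 28, 28])
    break
```

Swapping `meta_train=True` for `meta_val=True` or `meta_test=True` selects the meta-validation or meta-testing split.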
Hardware Specification: Yes. "Table 1: Comparing latency predictor with our proposed profiling... (columns: Hardware, Metric, Latency Predictor, Layer-wise Profiling; rows: 2080 Ti GPU..., Intel i9 CPU..., Pi Zero...). For instance, running a single network inference of VGG9 on the Raspberry Pi Zero with a 1GHz single-core ARMv6 CPU takes around 2.365 seconds to finish. We extensively evaluate H-Meta-NAS on various hardware platforms (GPU, CPU, mCPU, IoT, ASIC accelerator) and constraints (latency and model size)."
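The entry above contrasts a learned latency predictor with direct layer-wise profiling. A minimal sketch of CPU layer-wise timing in PyTorch follows; the `profile_layer` helper and all shapes are assumptions, and timing on a GPU would additionally require `torch.cuda.synchronize()` or CUDA events.

```python
import time
import torch
import torch.nn as nn

def profile_layer(layer, input_shape, warmup=10, runs=50):
    """Average wall-clock latency of a single layer on CPU, in seconds."""
    x = torch.randn(*input_shape)
    layer.eval()
    with torch.no_grad():
        for _ in range(warmup):   # warm-up iterations to stabilise timings
            layer(x)
        start = time.perf_counter()
        for _ in range(runs):
            layer(x)
    return (time.perf_counter() - start) / runs

# Example: profile a VGG-style 3x3 convolution block (shapes illustrative).
block = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1), nn.ReLU())
print(f"{profile_layer(block, (1, 64, 32, 32)) * 1e3:.2f} ms")
```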
Software Dependencies: Yes. "We use Python 3.8.3, PyTorch 1.7.0, Torchmeta 1.7.0."
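A quick sanity check of the stated environment, assuming the three packages are importable in the current interpreter:

```python
import sys
import torch
import torchmeta

# The paper reports Python 3.8.3, PyTorch 1.7.0 and Torchmeta 1.7.0.
print("python   ", sys.version.split()[0])   # expect 3.8.3
print("torch    ", torch.__version__)        # expect 1.7.0
print("torchmeta", torchmeta.__version__)    # expect 1.7.0
```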
Experiment Setup: Yes. "The genetic algorithm has a pool size P and number of iterations M. We pick p_i = 1.0 and e_s = 30, because the super-net reaches a relatively stable training accuracy at that point. We then start the decaying process, and the value (set to 5) is determined through a hyper-parameter study shown in our Appendix B."
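The quote implies a sampling probability held at p_i = 1.0 for the first e_s = 30 epochs of super-net training, after which a decay begins; the exact decay rule (and the role of the value 5) is only given in the paper's Appendix B. The linear schedule below is therefore an assumed stand-in, with `decay` and `p_min` as hypothetical parameters.

```python
def sampling_prob(epoch, p_init=1.0, e_start=30, decay=0.05, p_min=0.0):
    # Assumed schedule: hold p_init until e_start, then decay linearly.
    # The paper's actual rule lives in its Appendix B and is not
    # reproduced here.
    if epoch < e_start:
        return p_init
    return max(p_min, p_init - decay * (epoch - e_start))

print([round(sampling_prob(e), 2) for e in (0, 29, 30, 40, 60)])
```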