Rapid Model Architecture Adaption for Meta-Learning
Authors: Yiren Zhao, Xitong Gao, Ilia Shumailov, Nicolo Fusi, Robert D. Mullins
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate H-Meta-NAS on a range of popular few-shot learning benchmarks. For each dataset, we search for the meta-architecture and meta-parameters, then adapt the meta-architecture with respect to a target hardware-constraint pair. In the evaluation stage, we re-train the obtained hardware-aware task-specific architecture to convergence and report the final accuracy. |
| Researcher Affiliation | Collaboration | Yiren Zhao (Imperial College London, a.zhao@imperial.ac.uk); Xitong Gao (Shenzhen Institute of Advanced Technology, CAS, xt.gao@siat.ac.cn); Ilia Shumailov (University of Oxford, ilia.shumailov@chch.ox.ac.uk); Nicolo Fusi (Microsoft Research, fusi@microsoft.com); Robert D Mullins (University of Cambridge, robert.mullins@cl.cam.ac.uk) |
| Pseudocode | Yes | The full algorithm is detailed in Appendix D, which contains 'Algorithm 1: Genetic Algorithm for Meta-Architecture Search'. (A generic sketch of such a search loop appears after this table.) |
| Open Source Code | No | No explicit statement or link providing access to the source code for the methodology described in this paper. |
| Open Datasets | Yes | We consider three popular datasets in the few-shot learning community: Omniglot, MiniImageNet and Few-shot CIFAR100. Omniglot is a handwritten character recognition task containing 1623 characters [Lake et al., 2015]. MiniImageNet was first introduced by Vinyals et al. [2016]. This dataset contains images of 100 different classes from the ILSVRC-12 dataset [Deng et al., 2009]; the splits are taken from Ravi and Larochelle [2016]. |
| Dataset Splits | Yes | All tasks are divided into three sets, namely meta-training (Ttrain), meta-validation (Tval) and meta-testing (Ttest) sets. We use the meta train/validation/test splits originally used by Vinyals et al. [2016]. These splits cover 1028/172/423 classes (characters). We pick 16K training samples and 10K validation samples to train and test the latency predictor, which is the same setup used in OFA. (A Torchmeta loading sketch for these meta-splits appears after this table.) |
| Hardware Specification | Yes | Table 1 compares the latency predictor with the proposed layer-wise profiling across hardware (2080 Ti GPU, Intel i9 CPU, Pi Zero). For instance, running a single network inference of VGG9 on the Raspberry Pi Zero with a 1 GHz single-core ARMv6 CPU takes around 2.365 seconds. We extensively evaluate H-Meta-NAS on various hardware platforms (GPU, CPU, mCPU, IoT, ASIC accelerator) and constraints (latency and model size). (A layer-wise profiling sketch appears after this table.) |
| Software Dependencies | Yes | We use Python 3.8.3, PyTorch 1.7.0, Torchmeta 1.7.0 |
| Experiment Setup | Yes | The genetic algorithm has a pool size P and a number of iterations M. We pick p_i = 1.0 and e_s = 30, because the super-net reaches a relatively stable training accuracy at that point. We then start the decaying process; the decay value of 5 is determined through a hyper-parameter study shown in Appendix B. (An illustrative decay-schedule sketch appears after this table.) |
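
The paper's Algorithm 1 (Appendix D) is not reproduced in the excerpts above, so the following is only a minimal sketch of a generic genetic-algorithm search loop parameterized by the quoted hyper-parameters (pool size P, iteration count M); the `fitness`, `mutate`, and `crossover` helpers are hypothetical placeholders, not the paper's operators:

```python
import random

def genetic_search(init_pool, fitness, mutate, crossover,
                   pool_size=100, iterations=20, keep_top=0.25):
    """Generic GA loop (a sketch, not the paper's Algorithm 1): rank the
    pool by fitness, keep the best candidates as parents, and refill the
    pool with mutated / crossed-over children on every iteration."""
    pool = list(init_pool)
    for _ in range(iterations):                      # M iterations
        # Rank by fitness (e.g. validation accuracy under a hardware
        # constraint) and keep the top fraction as parents.
        pool.sort(key=fitness, reverse=True)
        parents = pool[:max(2, int(keep_top * pool_size))]
        # Refill the pool (size P) with mutations and crossovers.
        children = []
        while len(parents) + len(children) < pool_size:
            if random.random() < 0.5:
                children.append(mutate(random.choice(parents)))
            else:
                children.append(crossover(*random.sample(parents, 2)))
        pool = parents + children
    return max(pool, key=fitness)
```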
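
Since the dependency list names Torchmeta 1.7.0, a minimal sketch of loading the three meta-splits with Torchmeta's `omniglot` helper and `BatchMetaDataLoader` might look like the following; the 20-way 1-shot configuration is an illustrative choice, not necessarily the paper's:

```python
from torchmeta.datasets.helpers import omniglot
from torchmeta.utils.data import BatchMetaDataLoader

# One dataset object per meta-split (meta_train / meta_val / meta_test),
# each yielding 20-way 1-shot tasks. Ways/shots here are illustrative.
splits = {
    name: omniglot("data", ways=20, shots=1, test_shots=5,
                   download=True, **{f"meta_{name}": True})
    for name in ("train", "val", "test")
}

# Each batch is a dict with 'train' (support) and 'test' (query) splits,
# with inputs of shape [batch, ways * shots, C, H, W].
loaders = {name: BatchMetaDataLoader(ds, batch_size=16, shuffle=True)
           for name, ds in splits.items()}
```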
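
The layer-wise profiling that the paper compares against its latency predictor is not shown in the excerpts; below is a sketch, under the assumption of a simple `nn.Sequential` model timed on CPU (GPU timing would additionally need `torch.cuda.synchronize()`), of how per-layer latencies can be measured and summed into a lookup table for candidate architectures:

```python
import time
import torch
import torch.nn as nn

@torch.no_grad()
def profile_layerwise(model: nn.Sequential, x, warmup=10, runs=100):
    """Per-layer latency profiling (a sketch, not the paper's exact
    tooling): feed each layer its real activation and time it, so that
    per-layer costs can be summed for any candidate architecture."""
    table = {}
    for name, layer in model.named_children():
        for _ in range(warmup):                 # warm up caches
            layer(x)
        start = time.perf_counter()
        for _ in range(runs):
            layer(x)
        table[name] = (time.perf_counter() - start) / runs
        x = layer(x)                            # propagate real activation
    return table  # seconds per layer; sum(table.values()) ~ model latency

# Example: a small VGG-style block on CPU.
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                      nn.MaxPool2d(2), nn.Flatten(),
                      nn.Linear(64 * 16 * 16, 10))
print(profile_layerwise(model, torch.randn(1, 3, 32, 32)))
```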
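
Finally, the quoted setup holds p_i = 1.0 until epoch e_s = 30 and then decays it, but the excerpt does not give the schedule itself; the linear ramp below is therefore purely an illustrative assumption, with `decay_epochs` standing in for the unnamed value 5:

```python
def sampling_prob(epoch, p_init=1.0, e_start=30, decay_epochs=5):
    """Hypothetical decay schedule (the paper's exact form is not quoted):
    hold p_init until e_start, then ramp linearly to 0 over decay_epochs."""
    if epoch < e_start:
        return p_init
    return max(0.0, p_init * (1 - (epoch - e_start) / decay_epochs))
```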