MATE: Plugging in Model Awareness to Task Embedding for Meta Learning
Authors: Xiaohan Chen, Zhangyang Wang, Siyu Tang, Krikamol Muandet
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We demonstrate that MATE can help the learning agent to adapt faster and better to new tasks, thanks to the new model-aware inductive prior that constrains the hypothesis space. We illustrate that this new inductive bias is highly informative and adaptive across tasks, as a result of the proposed instance-adaptive soft feature-selection. We empirically demonstrate on two few-shot learning benchmarks that MATE improves 5-shot accuracy by up to 1% on top of state-of-the-art meta-learner backbones, showing MATE to be generally effective and easy to use." and "4 Experiments. In this section, we first describe the implementation details (Section 4.1), and then benchmark MATE on two few-shot classification datasets, CIFAR-FS [6] and miniImageNet [53] (Sections 4.2 and 4.3). We complement our quantitative results with a visualization of the embedding produced by MATE compared to model-agnostic embeddings (see Supplementary)." |
| Researcher Affiliation | Academia | Xiaohan Chen, Zhangyang Wang Department of Electrical and Computer Engineering University of Texas at Austin {xiaohan.chen, atlaswang}@utexas.edu Siyu Tang Department of Computer Science ETH Zürich siyu.tang@inf.ethz.ch Krikamol Muandet Max Planck Institute for Intelligent Systems Tübingen, Germany krikamol@tuebingen.mpg.de |
| Pseudocode | No | The paper provides conceptual diagrams (Figure 1, Figure 2) but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source codes for this paper are available at https://github.com/VITA-Group/MATE. |
| Open Datasets | Yes | "CIFAR-FS [6] is a popular few-shot classification benchmark... It is a variant of CIFAR-100 [19]..." and "miniImageNet [53] is a larger benchmark in which 100 classes are selected from ImageNet [39]..." |
| Dataset Splits | Yes | "Episodes in a dataset are divided into three sets: meta-training S_trn, meta-validation S_val, and meta-testing S_tst." and "CIFAR-FS ... by randomly splitting it into meta-training, meta-validation and meta-testing sets, containing 64, 16 and 20 classes, respectively." and "miniImageNet ... We follow the popular split in [38] to employ 64 classes for meta-training, 16 for meta-validation and 20 for meta-testing." |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU/GPU models, memory, or processing units used for the experiments. It only describes software settings and model architectures. |
| Software Dependencies | No | The paper does not specify the version numbers of software dependencies (e.g., Python, PyTorch/TensorFlow, CUDA). It mentions adapting from MetaOptNet's implementation but does not list the specific software stack. |
| Experiment Setup | Yes | Optimizer. We identically follow the practice in [21] for fair comparison. We use stochastic gradient descent (SGD) with momentum 0.9 and weight decay 0.0005. The SGD starts with an initial learning rate of 0.1, that is decayed to 0.006 at epoch 20. The meta-training phase takes a total of 30 epochs, each epoch consisting of 1,000 mini-batches. Each mini-batch samples 8 episodes. |
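The experiment-setup row above fully determines the meta-training schedule, which can be sketched in plain Python. This is an illustrative reconstruction, not code from the MATE repository; the names `learning_rate`, `meta_training_iterations`, and `SGD_CONFIG` are hypothetical, and the paper's exact epoch indexing for the decay step is assumed to be "0.1 through epoch 19, then 0.006".

```python
# Hedged sketch of the meta-training hyperparameters quoted from the paper:
# SGD with momentum 0.9 and weight decay 0.0005; initial learning rate 0.1,
# decayed to 0.006 at epoch 20; 30 epochs x 1,000 mini-batches x 8 episodes.
# All identifiers below are illustrative, not taken from the MATE codebase.

SGD_CONFIG = {"momentum": 0.9, "weight_decay": 5e-4}
TOTAL_EPOCHS = 30
BATCHES_PER_EPOCH = 1_000
EPISODES_PER_BATCH = 8


def learning_rate(epoch: int) -> float:
    """Piecewise-constant schedule: 0.1 before epoch 20, 0.006 afterwards."""
    return 0.1 if epoch < 20 else 0.006


def meta_training_iterations() -> int:
    """Total optimizer steps over the whole meta-training phase."""
    return TOTAL_EPOCHS * BATCHES_PER_EPOCH


if __name__ == "__main__":
    for epoch in (0, 19, 20, 29):
        print(f"epoch {epoch:2d}: lr = {learning_rate(epoch)}")
    print(f"total steps: {meta_training_iterations()}, "
          f"episodes seen: {meta_training_iterations() * EPISODES_PER_BATCH}")
```

In a real implementation this schedule would typically be expressed as a `torch.optim.SGD` optimizer plus a step-based LR scheduler; the standalone functions here just make the quoted numbers concrete and checkable.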