A Closer Look at the Training Strategy for Modern Meta-Learning
Authors: Jiaxin Chen, Xiao-Ming Wu, Yanke Li, Qimai Li, Li-Ming Zhan, Fu-lai Chung
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This paper conducts a theoretical investigation of this training strategy on generalization. From a stability perspective, we analyze the generalization error bound of generic meta-learning algorithms trained with such a strategy. We show that the S/Q episodic training strategy naturally leads to a counterintuitive generalization bound of O(1/√n), which depends only on the task number n and is independent of the inner-task sample size m. Under the common assumption m << n for few-shot learning, the bound of O(1/√n) implies strong generalization guarantees for modern meta-learning algorithms in the few-shot regime. To further explore the influence of training strategies on generalization, we propose a leave-one-out (LOO) training strategy for meta-learning and compare it with S/Q training. Experiments on standard few-shot regression and classification tasks with popular meta-learning algorithms validate our analysis. (An illustrative sketch of S/Q episode sampling follows the table.) |
| Researcher Affiliation | Academia | Jiaxin Chen¹, Xiao-Ming Wu¹, Yanke Li², Qimai Li¹, Li-Ming Zhan¹, and Fu-lai Chung¹. ¹Department of Computing, The Hong Kong Polytechnic University; ²Department of Mathematics, ETH Zurich. {jiax.chen, qee-mai.li, lmzhan.zhan}@connect.polyu.hk, {xiao-ming.wu, korris.chung}@polyu.edu.hk, yankli@student.ethz.ch |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code can be downloaded from https://github.com/jiaxinchen666/meta-theory. |
| Open Datasets | Yes | Few-shot classification. We follow the standard experimental setting proposed in [35] using the real-life dataset miniImageNet. This dataset has 100 classes and is split into a training set of 64 classes, a test set of 20 classes and a validation set of 16 classes. |
| Dataset Splits | Yes | This dataset has 100 classes and is split into a training set of 64 classes, a test set of 20 classes and a validation set of 16 classes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers). |
| Experiment Setup | Yes | We implement the meta-algorithms MAML [18] and Bilevel Programming [19] using an MLP with two hidden layers of size 40 with ReLU activation function. Both the input layer and the output layer have dimensionality 1. ... We implement MAML [18] and ProtoNet [31] using the Conv-4 backbone and follow the implementation details in [8]. We set m = 5, q = 1 for regression and m = 1, q = 1 for classification. (A minimal sketch of the regression MLP appears after the table.) |
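
The S/Q episodic strategy discussed in the Research Type row trains on sampled tasks, each split into a support set (m examples per class) used for inner-task adaptation and a query set (q examples per class) used to compute the meta-update loss; the paper's O(1/√n) bound is over the n sampled tasks rather than the inner-task sample size m. The sketch below shows one common way such an episode is sampled. It is a minimal illustration only: the function name `sample_sq_episode` and the dataset layout (a dict mapping class labels to example lists) are assumptions, not code from the authors' repository.

```python
import random


def sample_sq_episode(dataset, n_way=5, m_shot=1, q_query=1, rng=random):
    """Sample one support/query (S/Q) episode for n_way-way, m_shot-shot
    classification with q_query query examples per class.

    dataset: dict mapping class label -> list of examples (illustrative layout).
    Returns (support, query) as lists of (example, episode_label) pairs.
    """
    classes = rng.sample(sorted(dataset.keys()), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        examples = rng.sample(dataset[cls], m_shot + q_query)
        support += [(x, episode_label) for x in examples[:m_shot]]
        query += [(x, episode_label) for x in examples[m_shot:]]
    return support, query
```

With the settings quoted in the Experiment Setup row (m = 1, q = 1 for 5-way classification), each episode would contain 5 support and 5 query examples; the meta-learner adapts on the support set and is meta-updated on the query loss.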
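
The regression architecture quoted in the Experiment Setup row (input and output dimensionality 1, two hidden layers of width 40, ReLU activations) can be written down directly. Below is a minimal PyTorch sketch of that backbone only, assuming the standard reading of the quote; the optimizer, the meta-learning loop (MAML / Bilevel Programming), and the Conv-4 classification backbone are not reconstructed here.

```python
import torch.nn as nn

# Regression backbone as described in the quoted setup:
# 1-d input -> two hidden layers of width 40 with ReLU -> 1-d output.
regressor = nn.Sequential(
    nn.Linear(1, 40),
    nn.ReLU(),
    nn.Linear(40, 40),
    nn.ReLU(),
    nn.Linear(40, 1),
)
```

In the meta-learning experiments, a copy of this network would be adapted on each task's support set and evaluated on its query set, following the S/Q strategy analyzed in the paper.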