A Closer Look at the Training Strategy for Modern Meta-Learning

Authors: Jiaxin Chen, Xiao-Ming Wu, Yanke Li, Qimai Li, Li-Ming Zhan, Fu-lai Chung

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This paper conducts a theoretical investigation of this training strategy on generalization. From a stability perspective, we analyze the generalization error bound of generic meta-learning algorithms trained with such a strategy. We show that the S/Q episodic training strategy naturally leads to a counterintuitive generalization bound of O(1/√n), which depends only on the task number n but is independent of the inner-task sample size m. Under the common assumption m ≪ n for few-shot learning, the bound of O(1/√n) implies strong generalization guarantees for modern meta-learning algorithms in the few-shot regime. To further explore the influence of training strategies on generalization, we propose a leave-one-out (LOO) training strategy for meta-learning and compare it with S/Q training. Experiments on standard few-shot regression and classification tasks with popular meta-learning algorithms validate our analysis.
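The S/Q strategy referenced above splits each task's samples into a support set of size m for inner-loop adaptation and a query set of size q for the outer-loop loss; the proposed LOO strategy instead holds out one sample at a time as the query. A minimal sketch of both constructions, assuming a toy dataset keyed by class label (function names here are illustrative and not taken from the paper's repository; the LOO variant is one plausible reading of the strategy):

```python
import random

def make_sq_episode(data_by_class, n_way, m_support, q_query):
    """Sample an N-way episode and split each class's samples into
    m support examples and q query examples (the S/Q strategy)."""
    classes = random.sample(list(data_by_class), n_way)
    support, query = [], []
    for label, c in enumerate(classes):
        samples = random.sample(data_by_class[c], m_support + q_query)
        support += [(x, label) for x in samples[:m_support]]
        query += [(x, label) for x in samples[m_support:]]
    return support, query

def make_loo_episodes(data_by_class, n_way, m_support):
    """LOO variant: from m+1 samples per class, hold out one sample
    at a time as the query and adapt on the remaining m."""
    classes = random.sample(list(data_by_class), n_way)
    per_class = {c: random.sample(data_by_class[c], m_support + 1)
                 for c in classes}
    episodes = []
    for held_out in range(m_support + 1):
        support, query = [], []
        for label, c in enumerate(classes):
            samples = per_class[c]
            query.append((samples[held_out], label))
            support += [(x, label) for i, x in enumerate(samples)
                        if i != held_out]
        episodes.append((support, query))
    return episodes

# Toy usage: 3 classes with integer "samples".
data = {c: list(range(10 * c, 10 * c + 10)) for c in range(3)}
s, q = make_sq_episode(data, n_way=3, m_support=5, q_query=1)
print(len(s), len(q))  # 15 support and 3 query examples
```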
Researcher Affiliation | Academia | Jiaxin Chen¹, Xiao-Ming Wu¹, Yanke Li², Qimai Li¹, Li-Ming Zhan¹, and Fu-lai Chung¹; ¹Department of Computing, The Hong Kong Polytechnic University; ²Department of Mathematics, ETH Zurich. {jiax.chen, qee-mai.li, lmzhan.zhan}@connect.polyu.hk, {xiao-ming.wu, korris.chung}@polyu.edu.hk, yankli@student.ethz.ch
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code can be downloaded from https://github.com/jiaxinchen666/meta-theory.
Open Datasets | Yes | Few-shot classification. We follow the standard experimental setting proposed in [35] using the real-life dataset miniImageNet. This dataset has 100 classes and is split into a training set of 64 classes, a test set of 20 classes and a validation set of 16 classes.
Dataset Splits | Yes | This dataset has 100 classes and is split into a training set of 64 classes, a test set of 20 classes and a validation set of 16 classes.
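The split above operates at the class level (disjoint classes across train/validation/test), not at the image level. A minimal sketch of such a 64/16/20 class partition, using placeholder class names; note that the actual miniImageNet benchmark uses the fixed class lists of [35] rather than a random shuffle:

```python
import random

classes = [f"class_{i:03d}" for i in range(100)]  # placeholder names
rng = random.Random(0)  # fixed seed so the partition is reproducible
rng.shuffle(classes)

train_classes = classes[:64]
val_classes = classes[64:80]
test_classes = classes[80:]
print(len(train_classes), len(val_classes), len(test_classes))  # 64 16 20
```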
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not list ancillary software dependencies with version numbers (e.g., specific library or solver versions).
Experiment Setup | Yes | We implement the meta-algorithms MAML [18] and Bilevel Programming [19] using an MLP with two hidden layers of size 40 and ReLU activations. Both the input layer and the output layer have dimensionality 1. ... We implement MAML [18] and ProtoNet [31] using the Conv-4 backbone and follow the implementation details in [8]. We set m = 5, q = 1 for regression and m = 1, q = 1 for classification.
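For the regression setup, the stated backbone is a 1-40-40-1 MLP. A minimal PyTorch sketch of that architecture, shown only to make the stated dimensions concrete; optimizer settings and the MAML inner loop are not reproduced here:

```python
import torch
import torch.nn as nn

# 1-40-40-1 MLP with ReLU activations, matching the description above.
regressor = nn.Sequential(
    nn.Linear(1, 40), nn.ReLU(),
    nn.Linear(40, 40), nn.ReLU(),
    nn.Linear(40, 1),
)

x = torch.randn(5, 1)      # m = 5 support points of a regression task
print(regressor(x).shape)  # torch.Size([5, 1])
```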