Understanding Train-Validation Split in Meta-Learning with Neural Networks

Authors: Xinzhe Zuo, Zixiang Chen, Huaxiu Yao, Yuan Cao, Quanquan Gu

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our theory by conducting experiments on both synthetic and real datasets.
Researcher Affiliation | Academia | Department of Mathematics, University of California, Los Angeles (zxz@math.ucla.edu); Department of Computer Science, University of California, Los Angeles ({chenzx19, qgu}@cs.ucla.edu); Department of Computer Science, Stanford University (huaxiu@cs.stanford.edu); Department of Statistics & Actuarial Science, University of Hong Kong (yuancao@hku.hk)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access (link, explicit statement) to the source code for the methodology described.
Open Datasets | Yes | Real-world Data. In our experiments, we further justify our theoretical findings in two real-world datasets: Rainbow MNIST, mini Imagenet, which are discussed as follows. ... Following (Yao et al., 2021), Rainbow MNIST is a 10-way meta-learning dataset... Following the traditional meta-learning setting (Finn & Levine, 2017; Snell et al., 2017), mini Imagenet dataset is split into meta-training, meta-validation and meta-testing classes
Dataset Splits | Yes | mini Imagenet dataset is split into meta-training, meta-validation and meta-testing classes, where 64/16/20 classes are used for meta-training/validation/testing. We adopt the traditional N-way, K-shot setting to split the training and validation set in our experiment, where N = 5 and K = 1 in this paper (i.e., 5-way, 1-shot learning). (A minimal episode-split sketch follows the table.)
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory details) used for running the experiments.
Software Dependencies | No | The paper mentions "Huberized-ReLU" as the activation function and "four-block convolutional layers as the base learner" but does not specify any software libraries or their version numbers. (An illustrative base-learner sketch follows the table.)
Experiment Setup | Yes | Synthetic data. We generate synthetic data to test our theory. For our data generation we choose: d = 1000, K = 343, n = 10, σξ = 10.42, σs = 0.00066, ν^2 = 1. For our neural network we choose: m = 18, σ0 = 0.032. And finally we choose the following parameters for inner- and outer-level optimization: γ = 0.001, J = 5, η = 0.0001. ... The number of inner-loop steps is set as 5. The inner-loop and outer-loop learning rates are set as: 0.01 and 0.001 (mini Imagenet), 0.1 and 0.01 (Rainbow MNIST), respectively. (An illustrative inner/outer-loop sketch follows the table.)
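
To make the "Dataset Splits" row concrete, here is a minimal sketch of how a 5-way, 1-shot episode can be split into a support (training) set and a query (validation) set. The `images_by_class` structure, the `n_query` value, and the sampling logic are illustrative assumptions, not the authors' code.

```python
import random


def sample_episode(images_by_class, n_way=5, k_shot=1, n_query=15):
    """Sample one N-way, K-shot episode as a (support, query) pair.

    images_by_class: dict mapping a class label to a list of examples
    (this structure and the n_query value are assumptions for illustration).
    """
    classes = random.sample(list(images_by_class), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        examples = random.sample(images_by_class[cls], k_shot + n_query)
        # The first K examples form the support (inner-loop training) split;
        # the remainder form the query (outer-loop validation) split.
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query
```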
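The "Software Dependencies" row mentions a Huberized-ReLU activation and a four-block convolutional base learner, but no framework. Below is an illustrative PyTorch sketch; the framework choice, the quadratic smoothing form and threshold `h`, the 64-channel 3x3 blocks, and the 84x84 input assumption are common meta-learning defaults, not the paper's exact specification.

```python
import torch
import torch.nn as nn


class HuberizedReLU(nn.Module):
    """One common 'huberized' (smoothed) ReLU: quadratic near zero, linear above a
    threshold h. The exact smoothing used in the paper may differ."""

    def __init__(self, h=0.1):
        super().__init__()
        self.h = h

    def forward(self, x):
        return torch.where(
            x <= 0,
            torch.zeros_like(x),
            torch.where(x <= self.h, x ** 2 / (2 * self.h), x - self.h / 2),
        )


def conv_block(in_ch, out_ch):
    # Standard few-shot-learning conv block (channel count and pooling are assumptions).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        HuberizedReLU(),
        nn.MaxPool2d(2),
    )


class FourBlockConvNet(nn.Module):
    """Four-block convolutional base learner with a linear classification head.
    feature_dim assumes 84x84 inputs (84 -> 42 -> 21 -> 10 -> 5 after four poolings)."""

    def __init__(self, in_ch=3, hidden=64, n_way=5, feature_dim=64 * 5 * 5):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(in_ch, hidden),
            conv_block(hidden, hidden),
            conv_block(hidden, hidden),
            conv_block(hidden, hidden),
        )
        self.head = nn.Linear(feature_dim, n_way)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))
```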
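The "Experiment Setup" row reports 5 inner-loop steps and inner/outer learning rates of 0.01/0.001 (mini Imagenet) and 0.1/0.01 (Rainbow MNIST). A minimal bi-level (MAML-style) update with that structure might look like the sketch below; the use of PyTorch, SGD, cross-entropy loss, and the first-order (no second-derivative) approximation are assumptions for illustration, not the authors' implementation.

```python
import copy

import torch
import torch.nn.functional as F


def meta_train_step(model, meta_optimizer, task_batch, inner_lr=0.01, inner_steps=5):
    """One first-order, MAML-style outer update over a batch of tasks.

    Each task is ((x_support, y_support), (x_query, y_query)): the support split
    drives the inner loop, the query (validation) split drives the outer update.
    """
    meta_optimizer.zero_grad()
    for (x_s, y_s), (x_q, y_q) in task_batch:
        # Inner loop: adapt a copy of the model on the support (training) split.
        learner = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):
            inner_opt.zero_grad()
            F.cross_entropy(learner(x_s), y_s).backward()
            inner_opt.step()
        # Outer loop: evaluate the adapted learner on the query (validation) split
        # and accumulate first-order gradients into the meta-parameters.
        query_loss = F.cross_entropy(learner(x_q), y_q)
        grads = torch.autograd.grad(query_loss, learner.parameters())
        for p, g in zip(model.parameters(), grads):
            p.grad = g if p.grad is None else p.grad + g
    meta_optimizer.step()
```

With the quoted mini Imagenet settings this would be driven by, e.g., `meta_optimizer = torch.optim.SGD(model.parameters(), lr=0.001)` and `inner_lr=0.01`; for Rainbow MNIST, `lr=0.01` and `inner_lr=0.1`. Whether the authors use SGD or exact second-order meta-gradients is not stated in the excerpt.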