Understanding Train-Validation Split in Meta-Learning with Neural Networks
Authors: Xinzhe Zuo, Zixiang Chen, Huaxiu Yao, Yuan Cao, Quanquan Gu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theory by conducting experiments on both synthetic and real datasets. |
| Researcher Affiliation | Academia | Department of Mathematics, University of California, Los Angeles (zxz@math.ucla.edu); Department of Computer Science, University of California, Los Angeles ({chenzx19, qgu}@cs.ucla.edu); Department of Computer Science, Stanford University (huaxiu@cs.stanford.edu); Department of Statistics & Actuarial Science, University of Hong Kong (yuancao@hku.hk) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access (link, explicit statement) to the source code for the methodology described. |
| Open Datasets | Yes | Real-world Data. In our experiments, we further justify our theoretical findings on two real-world datasets, Rainbow MNIST and mini Imagenet, which are discussed as follows. ...Following (Yao et al., 2021), Rainbow MNIST is a 10-way meta-learning dataset... Following the traditional meta-learning setting (Finn & Levine, 2017; Snell et al., 2017), the mini Imagenet dataset is split into meta-training, meta-validation and meta-testing classes |
| Dataset Splits | Yes | mini Imagenet dataset is split into meta-training, meta-validation and meta-testing classes, where 64/16/20 classes are used for meta-training/validation/testing. We adopt the traditional N-way, K-shot setting to split the training and validation set in our experiment, where N=5 and K=1 in this paper (i.e., 5-way, 1-shot learning). (A sketch of this episode construction appears below the table.) |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU/CPU models, memory details) used for running the experiments. |
| Software Dependencies | No | The paper mentions "Huberized-ReLU" as the activation function and "four-block convolutional layers as the base learner" but does not specify any software libraries or their version numbers. |
| Experiment Setup | Yes | Synthetic data. We generate synthetic data to test our theory. For our data generation we choose: d = 1000, K = 343, n = 10, σξ = 10.42, σs = 0.00066, ν² = 1. For our neural network we choose: m = 18, σ0 = 0.032. And finally we choose the following parameters for inner- and outer-level optimization: γ = 0.001, J = 5, η = 0.0001. ... The number of inner-loop steps is set as 5. The inner-loop and outer-loop learning rates are set as 0.01 and 0.001 (mini Imagenet), and 0.1 and 0.01 (Rainbow MNIST), respectively. (A sketch of this inner/outer-loop setup appears below the table.) |
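
For readers reconstructing the task construction described in the Dataset Splits row, the following is a minimal sketch of how a 5-way, 1-shot episode partitions each task's data into a training (support) set and a validation (query) set. The function name, the `query_per_class` default, and the dict-of-lists data layout are illustrative assumptions, not taken from the paper.

```python
import random

def sample_episode(examples_by_class, n_way=5, k_shot=1, query_per_class=15, rng=None):
    """Sample one N-way, K-shot episode from class-indexed data.

    examples_by_class: dict mapping class id -> list of examples.
    Returns (support, query): the support set plays the role of the
    task-level training split and the query set the task-level
    validation split.
    """
    rng = rng or random.Random()
    classes = rng.sample(sorted(examples_by_class), n_way)   # pick N classes for this task
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        picked = rng.sample(examples_by_class[cls], k_shot + query_per_class)
        support += [(x, episode_label) for x in picked[:k_shot]]       # K examples per class
        query += [(x, episode_label) for x in picked[k_shot:]]         # held-out examples per class
    return support, query
```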
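The Experiment Setup row reports 5 inner-loop steps and inner/outer learning rates of 0.01/0.001 for mini Imagenet (0.1/0.01 for Rainbow MNIST). Since the paper releases no code, the sketch below is a generic first-order MAML-style inner/outer loop in PyTorch that plugs in those numbers; the first-order approximation, the cross-entropy loss, and the task-batch format are assumptions for illustration, not the authors' implementation.

```python
import copy
import torch
import torch.nn.functional as F

def meta_train_step(model, meta_opt, tasks, inner_lr=0.01, inner_steps=5):
    """One outer-loop update (first-order MAML-style sketch).

    tasks: iterable of (support_x, support_y, query_x, query_y) tensors.
    inner_steps and the learning rates follow the values reported in the
    paper; everything else here is an illustrative assumption.
    """
    meta_opt.zero_grad()
    for support_x, support_y, query_x, query_y in tasks:
        learner = copy.deepcopy(model)                      # task-specific copy of the base learner
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                        # inner-loop adaptation on the train split
            inner_opt.zero_grad()
            F.cross_entropy(learner(support_x), support_y).backward()
            inner_opt.step()
        learner.zero_grad()                                 # clear inner-loop grads before the outer loss
        query_loss = F.cross_entropy(learner(query_x), query_y)   # loss on the validation split
        query_loss.backward()
        # First-order update: copy the adapted copy's gradients onto the meta-parameters.
        for p, q in zip(model.parameters(), learner.parameters()):
            if q.grad is None:
                continue
            p.grad = q.grad.clone() if p.grad is None else p.grad + q.grad
    meta_opt.step()
```

Under these assumptions, a matching outer-loop optimizer would be `meta_opt = torch.optim.SGD(model.parameters(), lr=0.001)` for mini Imagenet, or `lr=0.01` with `inner_lr=0.1` for Rainbow MNIST.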