Improving Generalization in Meta-learning via Task Augmentation

Authors: Huaxiu Yao, Long-Kai Huang, Linjun Zhang, Ying Wei, Li Tian, James Zou, Junzhou Huang, Zhenhui Li

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | MetaMix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets and are compatible with existing meta-learning algorithms. 6. Experiments: To show the effectiveness of MetaMix, we conduct experiments on three meta-learning problems, namely: (1) drug activity prediction, (2) pose prediction, and (3) image classification.
Researcher Affiliation | Collaboration | (1) Stanford University, CA, USA; (2) Tencent AI Lab, Shenzhen, China; (3) Rutgers University, NJ, USA; (4) City University of Hong Kong, Hong Kong; (5) Pennsylvania State University, PA, USA.
Pseudocode | Yes | Algorithm 1: Meta-training Process of MAML-MetaMix
Open Source Code | No | The paper does not link to a source code repository or explicitly state that the code for its methodology is being released.
Open Datasets | Yes | We solve a real-world application of drug activity prediction (Martin et al., 2019)... following (Yin et al., 2020), we use the regression dataset created from Pascal 3D data (Xiang et al., 2014)... standard benchmarks (e.g., Omniglot (Lake et al., 2011) and MiniImagenet (Vinyals et al., 2016)) are considered as mutually-exclusive tasks...
Dataset Splits | Yes | We randomly selected 100 assays for meta-testing, 76 for meta-validation and the rest for meta-training.
Hardware Specification | No | The paper does not give details of the hardware used for experiments, such as CPU/GPU models or memory specifications.
Software Dependencies | No | The paper mentions meta-learning algorithms and models (MAML, MetaSGD, ANIL, etc.) but does not specify software dependencies such as programming languages, libraries, or frameworks with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | Drug Activity Prediction: We use a base model of two fully connected layers with 500 hidden units. In Beta(α, β), we set α = β = 0.5. More details on the dataset and settings are discussed in Appendix D.1. Pose Prediction: The base model consists of an encoder with three convolutional blocks and a decoder with four convolutional blocks. For MetaMix, we set α = β = 0.5 in Beta(α, β) and only perform Manifold Mixup on the decoder... Image Classification: We use the standard four-block convolutional neural network as the base model. We set α = β = 2.0 for all datasets. Detailed descriptions of experiment settings and hyperparameters are discussed in Appendix D.3.
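
Since no source code is released, the following is only a rough sketch of two details quoted above: the random assay split (100 for meta-testing, 76 for meta-validation, the rest for meta-training) and a mixing coefficient drawn from Beta(α, β) with α = β = 0.5. It assumes a plain mixup-style linear interpolation between paired support and query examples; the quoted text does not spell out the mixing procedure, and this is not the authors' Algorithm 1. The function names `split_assays` and `mix_pairs`, the seed, and the list-based data layout are hypothetical choices made for illustration.

```python
import random


def split_assays(assay_ids, n_test=100, n_val=76, seed=0):
    """Randomly partition assay ids into (meta-train, meta-val, meta-test).

    The counts 100 and 76 come from the quoted split; the seed and the
    total number of assays are assumptions left to the caller.
    """
    ids = list(assay_ids)
    if len(ids) <= n_test + n_val:
        raise ValueError("not enough assays to leave any for meta-training")
    rng = random.Random(seed)
    rng.shuffle(ids)
    test = ids[:n_test]
    val = ids[n_test:n_test + n_val]
    train = ids[n_test + n_val:]
    return train, val, test


def mix_pairs(support, query, alpha=0.5, beta=0.5, rng=None):
    """Interpolate paired support/query examples with lam ~ Beta(alpha, beta).

    Each example is a (features, target) pair, where features is a list of
    floats and target is a float. One lam is drawn and shared by all pairs.
    Assumption: simple input-level mixup; the paper's hidden-layer
    (Manifold Mixup) variant is not reproduced here.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, beta)
    mixed = []
    for (xs, ys), (xq, yq) in zip(support, query):
        x = [lam * a + (1 - lam) * b for a, b in zip(xs, xq)]
        y = lam * ys + (1 - lam) * yq
        mixed.append((x, y))
    return lam, mixed
```

With α = β = 0.5 the Beta density is U-shaped, so sampled coefficients tend to fall near 0 or 1 and each mixed example stays close to one of its two sources; with α = β = 2.0 (the image classification setting) the density peaks at 0.5 and the blends are more even.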