Improving Generalization in Meta-learning via Task Augmentation
Authors: Huaxiu Yao, Long-Kai Huang, Linjun Zhang, Ying Wei, Li Tian, James Zou, Junzhou Huang, Zhenhui Li
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Meta Mix and Channel Shuffle outperform state-of-the-art results by a large margin across many datasets and are compatible with existing meta-learning algorithms. ... To show the effectiveness of Meta Mix, we conduct experiments on three meta-learning problems, namely: (1) drug activity prediction, (2) pose prediction, and (3) image classification. |
| Researcher Affiliation | Collaboration | (1) Stanford University, CA, USA; (2) Tencent AI Lab, Shenzhen, China; (3) Rutgers University, NJ, USA; (4) City University of Hong Kong, Hong Kong; (5) Pennsylvania State University, PA, USA. |
| Pseudocode | Yes | Algorithm 1 Meta-training Process of MAML-Meta Mix |
| Open Source Code | No | The paper does not provide any specific links to source code repositories or explicitly state that the code for their methodology is being released. |
| Open Datasets | Yes | We solve a real-world application of drug activity prediction (Martin et al., 2019)... following (Yin et al., 2020), we use the regression dataset created from Pascal 3D data (Xiang et al., 2014)... standard benchmarks (e.g., Omniglot (Lake et al., 2011) and Mini Imagenet (Vinyals et al., 2016)) are considered as mutually-exclusive tasks... |
| Dataset Splits | Yes | We randomly selected 100 assays for meta-testing, 76 for meta-validation and the rest for meta-training. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as CPU/GPU models or memory specifications. |
| Software Dependencies | No | The paper mentions meta-learning algorithms and models (MAML, Meta SGD, ANIL, etc.) but does not specify any software dependencies like programming languages, libraries, or frameworks with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | Drug Activity Prediction: We use a base model of two fully connected layers with 500 hidden units. In Beta(α, β), we set α = β = 0.5. More details on the dataset and settings are discussed in Appendix D.1. Pose Prediction: The base model consists of an encoder with three convolutional blocks and a decoder with four convolutional blocks. For Meta Mix, we set α = β = 0.5 in Beta(α, β) and only perform Manifold Mixup on the decoder... Image Classification: We use the standard four-block convolutional neural network as the base model. We set α = β = 2.0 for all datasets. Detailed descriptions of experiment settings and hyperparameters are discussed in Appendix D.3. (A minimal sketch of the mixing step described here follows the table.) |
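For context on the Pseudocode and Experiment Setup rows above: Meta Mix draws a mixing coefficient λ ~ Beta(α, β) and interpolates support and query examples (or their hidden features, for Manifold Mixup) during the outer loop of MAML-style meta-training. The snippet below is a minimal sketch of that mixing step only, assuming a NumPy environment; the function name `metamix_batch` and the toy data are illustrative and are not taken from the paper, whose code is not released.

```python
# Hypothetical sketch of the Meta Mix mixing step (not the authors' code).
# Support and query examples are mixed with lambda ~ Beta(alpha, beta); the
# meta-loss would then be computed on the mixed batch. Layer selection for
# Manifold Mixup and the MAML inner/outer loops are omitted for brevity.
import numpy as np

def metamix_batch(x_support, y_support, x_query, y_query,
                  alpha=0.5, beta=0.5, rng=None):
    """Mix support and query examples (inputs or hidden features) and labels."""
    rng = rng or np.random.default_rng(0)
    n = min(len(x_support), len(x_query))
    lam = rng.beta(alpha, beta)                  # lambda ~ Beta(alpha, beta)
    idx_s = rng.permutation(len(x_support))[:n]  # random support/query pairing
    idx_q = rng.permutation(len(x_query))[:n]
    x_mix = lam * x_support[idx_s] + (1.0 - lam) * x_query[idx_q]
    y_mix = lam * y_support[idx_s] + (1.0 - lam) * y_query[idx_q]
    return x_mix, y_mix

# Toy regression-style usage; for classification, one-hot labels would be
# mixed the same way. alpha = beta = 0.5 matches the drug/pose settings above.
x_s, y_s = np.random.randn(5, 8), np.random.randn(5, 1)
x_q, y_q = np.random.randn(15, 8), np.random.randn(15, 1)
x_mix, y_mix = metamix_batch(x_s, y_s, x_q, y_q, alpha=0.5, beta=0.5)
print(x_mix.shape, y_mix.shape)  # (5, 8) (5, 1)
```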