The Role of Deconfounding in Meta-learning
Authors: Yinjie Jiang, Zhengyu Chen, Kun Kuang, Luotian Yuan, Xinhai Ye, Zhihua Wang, Fei Wu, Ying Wei
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our methods with the state-of-the-art solution of memorization overfitting MetaMix (Yao et al., 2021). We evaluate the performance on several backbones, such as MAML (Finn et al., 2017), ANIL (Raghu et al., 2020), Meta-SGD (Li et al., 2017), and T-NET (Lee & Choi, 2018) (together with MetaMix in Appendix E.4), to show the compatibility of our methods. In addition, the ablation study and the analysis of hyperparameters show the robustness of our methods. |
| Researcher Affiliation | Academia | 1 Department of Computer Science and Technology, Zhejiang University, Hangzhou, China; 2 Shanghai Institute for Advanced Study of Zhejiang University, Shanghai, China; 3 Shanghai AI Laboratory, Zhejiang University, Shanghai, China; 4 Department of Computer Science, City University of Hong Kong, Hong Kong, China. |
| Pseudocode | Yes | Algorithm 1 Meta-training Process of MAML-Dropout and MAML-Bins |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described in this paper, nor does it provide a direct link to a source-code repository for their implementation. The link provided in Appendix D.2 ('code link: https://github.com/google-research/google-research/tree/master/meta_learning_without_memorization/pose_data') refers to a dataset for a baseline, not the authors' own source code. |
| Open Datasets | Yes | Drug activity prediction: 'Following Yao et al. (2021), we apply our methods to the drug activity prediction task (Martin et al., 2019).' Pose prediction: 'We also evaluate another regression task created from Pascal 3D data (Xiang et al., 2014).' Image Classification: 'Omniglot (Lake et al., 2011) and MiniImagenet (Vinyals et al., 2016).' |
| Dataset Splits | Yes | Drug Activity Prediction: 'We split the tasks into meta-training tasks, meta-validation tasks and meta-testing tasks in the same way as Yao et al. (2021).' Pose Prediction: 'we randomly select 50 objects for meta-training and the other 15 objects for meta-testing.' Image Classification: 'non-mutually-exclusive N-way K-shot classification means each class is assigned with an unchangeable label from 1 to N in different tasks and training steps.' |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA A100, RTX 2080 Ti), CPU models, or specific cloud instance types used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details including version numbers for libraries or frameworks (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1). |
| Experiment Setup | Yes | Sinusoid Regression: 'Implementation of our methods (MAML-Dropout+MAML-Bins) in this experiment uses 5 bins and a dropout rate of 0.3.' Drug Activity Prediction: 'The base model of drug activity prediction is a two-layer Multilayer Perceptron (MLP) neural network with 500 neurons in each layer. Each fully connected layer is followed by a batch normalization layer and leaky ReLU activation. In either meta-training or meta-testing, the number of inner-loop adaptation steps equals to 10. During meta-training, the task batch size, the outer-loop learning rate, the inner-loop learning rate are set to 8, 0.001 and 0.01. The meta-training process altogether runs for 50 epochs while 60 epochs using Dropout, each of which includes 500 iterations. Dropout rate is set to be 0.1.' Pose Prediction: 'Implementation of our methods (MAML-Dropout+MAML-Bins) in this experiment uses 5 bins and a dropout rate of 0.2.' Image Classification: 'We use a four-block convolutional network, which is the same as the model used in (Yao et al., 2021)... We evaluate different meta-learning backbones and compare them with our methods (MAML-Dropout+MAML-Bins) using 5 bins and a dropout rate of 0.1.' |
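
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch, which also makes the implied training budget explicit. This is only an illustration: the class name `DrugActivityConfig`, its field names, and the `COMBINED_SETTINGS` dictionary are hypothetical and do not come from the authors' code, which the paper does not release. Only the numeric values are taken from the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrugActivityConfig:
    """Hypothetical container for the drug activity prediction settings quoted above."""

    hidden_layers: int = 2            # two-layer MLP
    hidden_units: int = 500           # neurons per layer
    inner_steps: int = 10             # inner-loop adaptation steps (train and test)
    task_batch_size: int = 8
    outer_lr: float = 0.001
    inner_lr: float = 0.01
    epochs: int = 50                  # without Dropout
    epochs_with_dropout: int = 60     # when Dropout is used
    iterations_per_epoch: int = 500
    dropout_rate: float = 0.1

    def total_iterations(self, use_dropout: bool) -> int:
        """Number of meta-training iterations implied by the quoted schedule."""
        epochs = self.epochs_with_dropout if use_dropout else self.epochs
        return epochs * self.iterations_per_epoch


# Bins and dropout rates quoted for the combined MAML-Dropout+MAML-Bins runs.
# The dictionary keys are illustrative labels, not identifiers from the paper.
COMBINED_SETTINGS = {
    "sinusoid_regression": {"bins": 5, "dropout_rate": 0.3},
    "pose_prediction": {"bins": 5, "dropout_rate": 0.2},
    "image_classification": {"bins": 5, "dropout_rate": 0.1},
}

if __name__ == "__main__":
    cfg = DrugActivityConfig()
    print(cfg.total_iterations(use_dropout=False))  # 50 epochs * 500 iterations
    print(cfg.total_iterations(use_dropout=True))   # 60 epochs * 500 iterations
```

Under these quoted values, the schedule works out to 25,000 meta-training iterations without Dropout and 30,000 with it.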