MAML is a Noisy Contrastive Learner in Classification
Authors: Chia-Hsiang Kao, Wei-Chen Chiu, Pin-Yu Chen
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on both mini-ImageNet and Omniglot datasets to validate the consistent improvement brought by our proposed method. In this section, we provide empirical evidence of the supervised contrastiveness of MAML and show that zero-initialization of w0, reduction in the initial norm of w0, or the application of the zeroing trick can speed up the learning profile. |
| Researcher Affiliation | Collaboration | National Yang Ming Chiao Tung University, Taiwan; IBM Research. Emails: chkao.md04@nycu.edu.tw, walon@cs.nctu.edu.tw, pin-yu.chen@ibm.com |
| Pseudocode | Yes | Appendix A (Original MAML and MAML with the zeroing trick): Algorithm 1, Second-order MAML; Algorithm 2, First-order MAML; Algorithm 3, Second-order MAML with the zeroing trick. (A hedged code sketch of these loops follows the table.) |
| Open Source Code | Yes | Code available at https://github.com/IandRover/MAML_noisy_contrasive_learner |
| Open Datasets | Yes | We conduct our experiments on the mini-ImageNet dataset (Vinyals et al., 2016; Ravi & Larochelle, 2017) and the Omniglot dataset (Lake et al., 2015). |
| Dataset Splits | Yes | For mini-ImageNet, it contains 84×84 RGB images of 100 classes from the ImageNet dataset with 600 samples per class. We split the dataset into 64, 16 and 20 classes for training, validation, and testing as proposed in (Ravi & Larochelle, 2017). For Omniglot... The dataset is split into training (1028 classes), validation (172 classes) and testing (423 classes) sets (Vinyals et al., 2016). |
| Hardware Specification | Yes | Each experiment is run on either a single NVIDIA 1080-Ti or V100 GPU. |
| Software Dependencies | No | The paper mentions that implementation is based on other works (Long (2018) and Deleu (2020)) but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The models are trained with the softmax cross entropy loss function using the Adam optimizer with an outer loop learning rate of 0.001 (Antoniou et al., 2019). [mini-ImageNet] The inner loop step size η is set to 0.01; the models are trained for 30000 iterations (Raghu et al., 2020). [Omniglot] The inner loop learning rate η is 0.4; the models are trained for 3000 iterations using FOMAML or SOMAML. (A hedged configuration sketch also follows the table.) |
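
The pseudocode row above lists second-order MAML, first-order MAML, and second-order MAML with the zeroing trick. Below is a minimal sketch, not the authors' released code, of how the second-order inner/outer loops and the zeroing trick (resetting the final linear layer w0 to zero before adaptation) could be written in PyTorch. The function names, the single-inner-step default, the toy encoder, and the 5-way head are illustrative assumptions; see the official repository for the reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def inner_adapt(encoder, head, support_x, support_y, inner_lr, inner_steps=1):
    """Inner loop: adapt the linear head on a task's support set.

    create_graph=True keeps the adaptation inside the autograd graph so the
    outer update differentiates through it (second-order MAML); setting it to
    False would give a first-order variant.
    """
    fast_w, fast_b = head.weight, head.bias
    for _ in range(inner_steps):
        logits = F.linear(encoder(support_x), fast_w, fast_b)
        loss = F.cross_entropy(logits, support_y)
        grad_w, grad_b = torch.autograd.grad(loss, (fast_w, fast_b), create_graph=True)
        fast_w = fast_w - inner_lr * grad_w
        fast_b = fast_b - inner_lr * grad_b
    return fast_w, fast_b


def outer_step(encoder, head, meta_opt, task_batch, inner_lr, zeroing_trick=True):
    """Outer loop: one meta-update over a batch of (support, query) tasks."""
    if zeroing_trick:
        # Zeroing trick: reset the final linear layer to zero before adaptation,
        # so every outer iteration starts from w0 = 0.
        with torch.no_grad():
            head.weight.zero_()
            head.bias.zero_()
    meta_opt.zero_grad()
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in task_batch:
        fast_w, fast_b = inner_adapt(encoder, head, support_x, support_y, inner_lr)
        query_logits = F.linear(encoder(query_x), fast_w, fast_b)
        meta_loss = meta_loss + F.cross_entropy(query_logits, query_y)
    meta_loss = meta_loss / len(task_batch)
    meta_loss.backward()
    meta_opt.step()
    return meta_loss.item()


# Illustrative wiring (placeholder architecture, not the paper's backbone):
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 84 * 84, 64), nn.ReLU())
head = nn.Linear(64, 5)  # 5-way classification head
meta_opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)
```

Keeping the head as explicit `fast_w`/`fast_b` tensors makes the second-order dependence of the query loss on the encoder visible, which is the mechanism the paper analyzes when interpreting MAML as a (noisy) contrastive learner.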
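The experiment-setup row quotes two inner-loop settings without naming their datasets. The configuration sketch below is my reading of that quote, grouping the values into a mini-ImageNet setting and an Omniglot setting; the class name and field names are illustrative, not an official configuration file from the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaTrainConfig:
    outer_lr: float   # Adam learning rate for the outer (meta) update
    inner_lr: float   # inner-loop step size eta
    iterations: int   # number of outer-loop training iterations


# Outer-loop optimizer is Adam with lr 0.001 in both settings (Antoniou et al., 2019).
# Dataset attribution of the inner-loop values is an assumption based on the quoted setup.
MINI_IMAGENET = MetaTrainConfig(outer_lr=1e-3, inner_lr=0.01, iterations=30_000)
OMNIGLOT = MetaTrainConfig(outer_lr=1e-3, inner_lr=0.4, iterations=3_000)
```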