Any-Way Meta Learning

Authors: JunHoo Lee, Yearim Kim, Hyunho Lee, Nojun Kwak

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments conducted on renowned architectures like MAML and ProtoNet affirm the effectiveness of our method."
Researcher Affiliation | Academia | Seoul National University, {mrjunoo, yerim1656, hhlee822, nojunk}@snu.ac.kr
Pseudocode | Yes | "Algorithm 1: Any-Way Meta-Learning"
Open Source Code | No | The paper mentions implementing with the torchmeta library and references https://github.com/tristandeleu/pytorch-meta, but it does not state that the authors' own code for the proposed method is open-source, nor does it provide a link to it.
Open Datasets | Yes | "Datasets: We evaluated our methodology on a diverse range of datasets. The general datasets, denoted as G, such as MiniImageNet (Vinyals et al. 2016) and TieredImageNet (Ren et al. 2018) (subsets of the larger ImageNet (Russakovsky et al. 2015) with 100 and 400 classes, respectively), serve as versatile bases for broader tasks. In contrast, the Cars (Krause et al. 2013) and CUB (Welinder et al. 2010) datasets, representing specific datasets or S, are widely used for fine-grained image classification evaluations due to their focus on closely related objects with subtle variations." (A loading sketch using torchmeta follows the table.)
Dataset Splits | No | The paper mentions "validation accuracy" and a "test phase" and refers to standard datasets, but it does not detail how these datasets were split into training, validation, and test sets (e.g., percentages or exact sample counts per split).
Hardware Specification | Yes | "Environments: We implemented MAML and ProtoNet using the torchmeta library (Deleu et al. 2019), using a single A100 GPU."
Software Dependencies | No | The paper mentions the torchmeta library but does not specify its version, nor does it list versions of other key software components such as Python or PyTorch.
Experiment Setup | Yes | "Hyperparameters: In line with (Oh et al. 2020), our experiments involved sampling 60,000 episodes. We adopted the 4-conv architecture detailed in (Vinyals et al. 2016). The learning rates were set at 0.5 for the inner loop and 0.001 for the outer loop. Depending on the shot type, the λ values were adjusted: 0.1 for 1-shot and 0.5 for 5-shot for MAML, and 0.1 for 5-shot and 0.01 for 1-shot for ProtoNet. For the mixup showcase, we followed the convention of sampling the mixup rate from a beta distribution with α = β = 0.5. When adopting mixup, we assign a separate numeric label to the mixed inputs. Also, when constructing prototype vectors, the EMA (exponential moving average) rate was set to 0.05 for 5-shot and 0.01 for 1-shot." (A hedged sketch of the mixup and EMA rules follows the table.)
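
Since the paper builds on the torchmeta library rather than releasing its own code, the following minimal sketch shows how the few-shot episodes it describes could be sampled with torchmeta's stock MiniImageNet helper. The data folder, batch size, and 15 query shots are illustrative assumptions, not values stated in the paper.

```python
from torchmeta.datasets.helpers import miniimagenet
from torchmeta.utils.data import BatchMetaDataLoader

# "data" is an assumed download folder; ways/shots follow the standard
# 5-way 1-shot protocol, and test_shots=15 is a common (assumed) choice.
dataset = miniimagenet("data", ways=5, shots=1, test_shots=15,
                       meta_train=True, download=True)

loader = BatchMetaDataLoader(dataset, batch_size=4, num_workers=4)

for batch in loader:
    support_x, support_y = batch["train"]  # [tasks, ways*shots, C, H, W]
    query_x, query_y = batch["test"]       # [tasks, ways*test_shots, C, H, W]
    break
```

torchmeta ships analogous helpers (e.g., `tieredimagenet`, `cub`) for two of the other datasets listed above; the Cars dataset does not appear to be bundled with torchmeta and would likely need a custom loader.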
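The mixup and prototype-EMA settings quoted in the Experiment Setup row translate into two small update rules. The sketch below encodes only what the excerpt states (a Beta(0.5, 0.5) mixing rate, a separate label for mixed inputs, and EMA rates of 0.05/0.01); the tensor shapes and the exact EMA convention are assumptions.

```python
import numpy as np
import torch

ALPHA = BETA = 0.5                            # mixup rate ~ Beta(0.5, 0.5), as reported
EMA_RATES = {"5-shot": 0.05, "1-shot": 0.01}  # prototype EMA rates, as reported

def mixup(x1, x2):
    """Mix two inputs with a rate sampled from Beta(ALPHA, BETA).

    The paper assigns the mixed input a *separate* numeric label,
    so unlike standard mixup no label interpolation is performed here.
    """
    lam = float(np.random.beta(ALPHA, BETA))
    return lam * x1 + (1.0 - lam) * x2

def ema_update(proto, new_proto, rate):
    """EMA update of a class prototype.

    The excerpt gives only the rate; which term it weights is an
    assumption (here: the freshly computed prototype).
    """
    return (1.0 - rate) * proto + rate * new_proto

# Illustrative usage with assumed shapes (84x84 images, 64-d embeddings).
x_mix = mixup(torch.randn(3, 84, 84), torch.randn(3, 84, 84))
proto = ema_update(torch.zeros(64), torch.randn(64), EMA_RATES["5-shot"])
```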