Meta-Learning with Shared Amortized Variational Inference

Authors: Ekaterina Iakovleva, Jakob Verbeek, Karteek Alahari

ICML 2020

Reproducibility checklist (Variable / Result / LLM Response):
Research Type: Experimental. "We evaluate our approach on the miniImageNet, CIFAR-FS and FC100 datasets, and present results demonstrating its advantages over previous work." "Experiments on few-shot image classification using the miniImageNet, CIFAR-FS and FC100 datasets confirm these findings, and we observe improved accuracy using the variational approach to train the VERSA model (Gordon et al., 2019)."
Researcher Affiliation: Collaboration. "¹Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, 38000 Grenoble, France. ²Facebook Artificial Intelligence Research. Work done while Jakob Verbeek was at Inria."
Pseudocode: No. The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code: Yes. "We provide an implementation of our method at: https://github.com/katafeya/samovar."
Open Datasets: Yes. "miniImageNet (Vinyals et al., 2016) consists of 100 classes selected from ILSVRC-12 (Russakovsky et al., 2015). FC100 (Oreshkin et al., 2018) was derived from CIFAR-100 (Krizhevsky, 2009). CIFAR-FS (Bertinetto et al., 2019) is another meta-learning dataset derived from CIFAR-100."
Dataset Splits: Yes. miniImageNet: "We follow the split from Ravi & Larochelle (2017) with 64 meta-train, 16 meta-validation and 20 meta-test classes, and 600 images in each class." FC100: "There are 60 meta-train classes from 12 superclasses, and 20 meta-validation and 20 meta-test classes, each from four corresponding superclasses." CIFAR-FS: "It was created by a random split into 64 meta-train, 16 meta-validation and 20 meta-test classes."
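For quick reference, the class counts quoted above can be collected in one place. This is a sketch in Python, not code from the paper; the dictionary keys and structure are our own naming:

```python
# Meta-learning class splits as quoted from the paper (class counts only).
SPLITS = {
    "miniImageNet": {"meta_train": 64, "meta_val": 16, "meta_test": 20},  # 600 images per class
    "FC100":        {"meta_train": 60, "meta_val": 20, "meta_test": 20},
    "CIFAR-FS":     {"meta_train": 64, "meta_val": 16, "meta_test": 20},
}

# Each split partitions its dataset's 100 classes.
for name, split in SPLITS.items():
    assert sum(split.values()) == 100, name
```

The assertion makes the consistency check explicit: all three datasets have 100 classes in total, so the three disjoint meta-splits must sum to 100.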
Hardware Specification: No. The paper does not explicitly mention the specific hardware (e.g., GPU models, CPU types) used to run the experiments.
Software Dependencies: No. The paper mentions using "the code provided by Gordon et al. (2019)" but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup: Yes. "For a fair comparison with VERSA (Gordon et al., 2019), we follow the same experimental setup, including the network architectures, optimization procedure, and episode sampling. In particular, we use the shallow CONV-5 feature extractor. In other experiments we use the ResNet-12 backbone feature extractor (Oreshkin et al., 2018; Mishra et al., 2018). The cosine classifier is scaled by setting α to 25 when data augmentation is not used, and 50 otherwise. The main and auxiliary tasks are trained concurrently: in episode t out of T, the auxiliary task is sampled with probability ρ = 0.9^(12t/T)."
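To make the quoted setup concrete, here is a minimal Python sketch of the annealed auxiliary-task sampling probability and the scaled cosine classifier. Note the caveats: reading the printed schedule as ρ = 0.9^(12t/T) is our interpretation of the formula, and the function names are ours, not the paper's:

```python
import math

def aux_task_probability(t, T, base=0.9, rate=12.0):
    """Probability of sampling the auxiliary task in episode t of T.

    Assumes the schedule rho = 0.9 ** (12 * t / T): it decays from 1.0
    at t = 0 toward 0.9 ** 12 (about 0.28) at t = T.
    """
    return base ** (rate * t / T)

def scaled_cosine_logit(x, w, alpha=25.0):
    """Cosine-similarity logit between feature x and class weight w,
    scaled by alpha (25 without data augmentation, 50 with it)."""
    dot = sum(a * b for a, b in zip(x, w))
    nx = math.sqrt(sum(a * a for a in x))
    nw = math.sqrt(sum(b * b for b in w))
    return alpha * dot / (nx * nw)

T = 1000
print(round(aux_task_probability(0, T), 3))  # 1.0
print(round(aux_task_probability(T, T), 3))  # 0.282
```

The scale α compensates for the cosine similarity being bounded in [-1, 1], which would otherwise produce overly flat softmax distributions.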