Meta-Learning for Semi-Supervised Few-Shot Classification
Authors: Mengye Ren, Eleni Triantafillou, Sachin Ravi, Jake Snell, Kevin Swersky, Joshua B. Tenenbaum, Hugo Larochelle, Richard S. Zemel
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate these methods on versions of the Omniglot and miniImageNet benchmarks, adapted to this new framework augmented with unlabeled examples. We also propose a new split of ImageNet, consisting of a large set of classes, with a hierarchical structure. Our experiments confirm that our Prototypical Networks can learn to improve their predictions due to unlabeled examples, much like a semi-supervised algorithm would. |
| Researcher Affiliation | Collaboration | University of Toronto, Princeton University, Google Brain, MIT, CIFAR, Vector Institute |
| Pseudocode | No | The paper describes its algorithms (soft k-means refinement of prototypes and its masked variant) using mathematical equations and textual explanations, but it does not include formal pseudocode blocks or algorithm listings. A hedged sketch of the refinement step appears below the table. |
| Open Source Code | Yes | Code available at https://github.com/renmengye/few-shot-ssl-public |
| Open Datasets | Yes | "Omniglot (Lake et al., 2011) is a dataset of 1,623 handwritten characters from 50 alphabets." "miniImageNet (Vinyals et al., 2016) is a modified version of the ILSVRC-12 dataset (Russakovsky et al., 2015)." "tieredImageNet is our proposed dataset for few-shot classification. ... The full list of classes per category will also be made public." |
| Dataset Splits | Yes | "Omniglot ... These are split into 4,112 training classes, 688 validation classes, and 1,692 testing classes." "miniImageNet ... These splits use 64 classes for training, 16 for validation, and 20 for test." "tieredImageNet ... These are split into 20 training, 6 validation and 8 testing categories." Table 4 (tieredImageNet statistics): 20/6/8 categories for train/val/test (34 total), 351/97/160 classes (608 total), and 448,695/124,261/206,209 images (779,165 total). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | "For Omniglot, ... the learning rate was set to 1e-3, and cut in half every 2K updates starting at update 2K. We trained for a total of 20K updates. For miniImageNet and tieredImageNet, we trained with a starting learning rate of 1e-3, which we also decayed. We started the decay after 25K updates, and every 25K updates thereafter we cut it in half. We trained for a total of 200K updates. ... For the MLP used in the Masked Soft k-Means model, we use a single hidden layer with 20 hidden units with a tanh non-linearity for all 3 datasets." A sketch of this learning-rate schedule appears after the table. |
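
For reference, the prototype-refinement procedure the Pseudocode row refers to can be written compactly. Below is a minimal NumPy sketch of one soft k-means refinement step over the unlabeled set, following the equations the paper describes; the function and variable names are mine, not the authors' or the released repo's.

```python
import numpy as np

def refine_prototypes(support_emb, support_labels, unlabeled_emb, num_classes):
    """One soft k-means refinement step (hedged sketch, not the authors' code).

    support_emb:    (Ns, D) embeddings of labeled support examples
    support_labels: (Ns,)   integer class labels in [0, num_classes)
    unlabeled_emb:  (Nu, D) embeddings of unlabeled examples
    """
    # Initial prototypes: per-class mean of the labeled support embeddings.
    protos = np.stack([support_emb[support_labels == c].mean(axis=0)
                       for c in range(num_classes)])                    # (C, D)

    # Soft assignments of unlabeled points: softmax over negative
    # squared Euclidean distances to each prototype.
    d2 = ((unlabeled_emb[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    logits = -d2
    z = np.exp(logits - logits.max(axis=1, keepdims=True))
    z /= z.sum(axis=1, keepdims=True)                                   # (Nu, C)

    # Refined prototypes: weighted mean of labeled and softly assigned
    # unlabeled embeddings.
    counts = np.bincount(support_labels, minlength=num_classes).astype(float)
    num = protos * counts[:, None] + z.T @ unlabeled_emb                # (C, D)
    den = counts + z.sum(axis=0)
    return num / den[:, None]
```

The masked variant additionally down-weights unlabeled examples judged to be distractors (via the small MLP mentioned in the Experiment Setup row) before taking the weighted average.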
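The learning-rate schedules quoted in the Experiment Setup row reduce to a simple step decay. A hedged sketch, assuming halvings land exactly at multiples of the decay interval (the paper does not pin down the boundary behavior):

```python
def learning_rate(step, dataset="omniglot"):
    """Step-decay schedule from the reported setup: start at 1e-3 and halve
    at fixed intervals (helper name and boundary handling are mine)."""
    base = 1e-3
    # Omniglot: halve every 2K updates starting at 2K (20K updates total).
    # miniImageNet / tieredImageNet: halve every 25K starting at 25K (200K total).
    decay_every = 2_000 if dataset == "omniglot" else 25_000
    return base * 0.5 ** (step // decay_every)
```

For example, on Omniglot this gives 1e-3 for updates 0 to 1,999, 5e-4 for updates 2,000 to 3,999, and so on, reaching roughly 2e-6 by the final update of the 20K-step run.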