Learning a Universal Template for Few-shot Dataset Generalization
Authors: Eleni Triantafillou, Hugo Larochelle, Richard Zemel, Vincent Dumoulin
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally evaluate FLUTE on few-shot dataset generalization using the recent Meta-Dataset benchmark (Triantafillou et al., 2020) that is comprised of 10 diverse datasets, 8 of which can be used for training, with the remaining 2 reserved for evaluation. To obtain a richer set of evaluation tasks, we incorporate 3 additional evaluation-only datasets, following Requeima et al. (2019). FLUTE significantly outperforms the state-of-the-art on the challenging Meta-Dataset benchmark. ... We present these results in Table 2. |
| Researcher Affiliation | Collaboration | 1University of Toronto, Vector Institute 2Google Research, Brain Team 3Work done at Google. |
| Pseudocode | No | The paper describes the algorithm steps in narrative text and refers to Figure 1 for an illustration, but does not provide formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available and has been incorporated into the Meta-Dataset codebase: https://github.com/google-research/meta-dataset |
| Open Datasets | Yes | We conduct our experimental evaluation on the recent Meta-Dataset benchmark... In more detail, the training set contains classes from ImageNet, Omniglot, Aircraft, Birds, Flowers, Quickdraw, Fungi, and Textures... (Triantafillou et al., 2020) |
| Dataset Splits | No | The paper states that "a disjoint validation set of classes Cval is also available for model selection" in the context of few-shot classification on Meta-Dataset, but it does not give split percentages or class counts for this validation set, nor does it detail how the split is derived or used beyond general model selection. |
| Hardware Specification | No | The paper does not mention any specific hardware used for running the experiments (e.g., GPU models, CPU types, or cloud compute instance specifications). |
| Software Dependencies | No | The paper mentions software components like ResNet-18, a cosine classifier, stochastic gradient descent with momentum, and Adam, but does not provide specific version numbers for any software, libraries, or frameworks (e.g., Python, TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | During training of our universal template, we treat each dataset-specific readout head rm as a cosine classifier... We use stochastic gradient descent with momentum as the optimizer for this phase, with a cosine decay with restarts schedule for the learning rate. ... We use Adam for this phase. ... We used 6 steps for this, with a learning rate of 0.005, which are the values we used for FLUTE's results in Table 2, chosen based on the validation set. |
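The two ingredients quoted in the Experiment Setup row can be sketched concretely: a cosine-classifier readout head computes scaled cosine similarities between L2-normalized features and class weights, and a cosine-decay-with-restarts schedule anneals the learning rate over each cycle before restarting. This is a minimal NumPy sketch, not the paper's implementation; the `scale` factor and restart `period` are illustrative assumptions (the paper specifies only the 6 Adam fine-tuning steps at learning rate 0.005, not these values).

```python
import numpy as np

def cosine_classifier_logits(features, weights, scale=10.0):
    """Cosine-classifier readout head: logits are scaled cosine
    similarities between L2-normalized features and class weights.
    `scale` is an assumed temperature, not taken from the paper."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return scale * f @ w.T  # shape: (num_examples, num_classes)

def cosine_decay_with_restarts(step, base_lr, period):
    """Cosine decay with restarts: LR follows a half-cosine from
    base_lr down to 0 over each `period` steps, then restarts."""
    t = (step % period) / period  # position within the current cycle
    return 0.5 * base_lr * (1.0 + np.cos(np.pi * t))
```

For example, with `period=100` the schedule yields `base_lr` at steps 0 and 100 (a restart) and half of `base_lr` at the cycle midpoint, while the logits are bounded in magnitude by `scale` regardless of feature norms.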