Meta-learning with differentiable closed-form solvers
Authors: Luca Bertinetto, Joao F. Henriques, Philip Torr, Andrea Vedaldi
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the strength of our approach by performing extensive experiments on Omniglot (Lake et al., 2015), CIFAR-100 (Krizhevsky & Hinton, 2009) (adapted to the few-shot problem) and miniImageNet (Vinyals et al., 2016). Our base learners are fast, simple to implement, and can achieve performance that is competitive with or superior to the state of the art in terms of accuracy. |
| Researcher Affiliation | Collaboration | Luca Bertinetto (Five AI & University of Oxford, luca@robots.ox.ac.uk); João Henriques (University of Oxford, joao@robots.ox.ac.uk); Philip H.S. Torr (Five AI & University of Oxford, philip.torr@eng.ox.ac.uk); Andrea Vedaldi (University of Oxford, vedaldi@robots.ox.ac.uk) |
| Pseudocode | No | The paper describes its methods, ridge regression and IRLS, mathematically and textually (e.g., in Sections 3.2 and 3.3) but does not include an explicitly labeled 'Pseudocode' or 'Algorithm' block. (A hedged sketch of the ridge-regression base learner is given after this table.) |
| Open Source Code | Yes | The code for both our methods and the splits of CIFAR-FS are available at http://www.robots.ox.ac.uk/~luca/r2d2.html. |
| Open Datasets | Yes | We analyze their performance against the recent literature on multi-class and binary classification problems using three few-shot learning benchmarks: Omniglot (Lake et al., 2015), miniImageNet (Vinyals et al., 2016) and CIFAR-FS, which we introduce in this paper. The code for both our methods and the splits of CIFAR-FS are available at http://www.robots.ox.ac.uk/~luca/r2d2.html. |
| Dataset Splits | Yes | Including rotations, we use 4800 classes for meta-training and meta-validation and 1692 for meta-testing. ... As all recent work, we adopt the same splits of Ravi & Larochelle (2017), who employ 64 classes for meta-training, 16 for meta-validation and 20 for meta-testing. |
| Hardware Specification | Yes | In Table 4 we compare the amount of time required by two representative methods and ours to solve 10,000 episodes (each with 10 images) on a single NVIDIA GTX 1080 GPU. |
| Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2015) for optimization and 'standard automatic differentiation packages', but does not provide specific version numbers for these or any other software dependencies such as Python, PyTorch/TensorFlow, or other libraries. |
| Experiment Setup | Yes | At the meta-learning level, we train our methods with Adam (Kingma & Ba, 2015) with an initial learning rate of 0.005, dampened by 0.5 every 2,000 episodes. Training is stopped when the error on the meta-validation set does not decrease meaningfully for 20,000 episodes. ... The four convolutional layers have [96, 192, 384, 512] filters. Dropout is applied to the last two blocks for the experiments on miniImageNet and CIFAR-FS, respectively with probabilities 0.1 and 0.4. (A sketch of this optimizer schedule appears after the table.) |
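Since the paper contains no pseudocode block, the following is a minimal sketch of the differentiable closed-form ridge-regression base learner (R2-D2) as described in Section 3.2, written in PyTorch. The function name, the episode variables in the usage comment, and the `alpha`/`beta` output scaling are illustrative assumptions based on the paper's description, not the authors' released implementation (linked above).

```python
import torch

def ridge_regression_base_learner(X, Y, lam=1.0):
    """Differentiable closed-form ridge regression in its dual (Woodbury) form.

    X:   (n, d) embeddings of the episode's support set.
    Y:   (n, c) one-hot support labels.
    lam: regularization strength (can itself be meta-learned).

    Returns W: (d, c) classifier weights,
        W = X^T (X X^T + lam * I)^{-1} Y.
    The linear system is only n x n (the support-set size), so it is cheap to
    solve, and every step is differentiable, so gradients flow back into the
    embedding network during meta-training.
    """
    n = X.shape[0]
    gram = X @ X.t() + lam * torch.eye(n, device=X.device, dtype=X.dtype)
    W = X.t() @ torch.linalg.solve(gram, Y)
    return W

# Hypothetical usage inside one few-shot episode:
#   Z_support = embedding_net(support_images)                    # (n, d)
#   W = ridge_regression_base_learner(Z_support, one_hot_labels)
#   logits = alpha * (embedding_net(query_images) @ W) + beta    # alpha, beta meta-learned
```

The dual form is used here because the number of support samples n in a few-shot episode is much smaller than the embedding dimension d, which matches the paper's motivation for preferring a closed-form solver over iterative inner-loop optimization.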
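The meta-optimization schedule quoted in the "Experiment Setup" row can likewise be written as a short sketch. Everything below is an assumption reconstructed from the quoted hyperparameters (Adam, initial learning rate 0.005, halved every 2,000 episodes, early stopping on meta-validation error); the stand-in model and dummy loss are placeholders, not the paper's architecture or episode sampler.

```python
import torch
import torch.nn as nn

# Stand-in for the embedding network; the paper's backbone has four
# convolutional blocks with [96, 192, 384, 512] filters.
model = nn.Linear(512, 5)

# Reported schedule: Adam, lr = 0.005, halved every 2,000 episodes.
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=2000, gamma=0.5)

for episode in range(10000):
    loss = model(torch.randn(10, 512)).pow(2).mean()  # dummy per-episode loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # lr becomes 0.0025 after 2,000 episodes, 0.00125 after 4,000, ...
    # In the paper, training stops once meta-validation error has not
    # improved meaningfully for 20,000 episodes.
```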