Optimization as a Model for Few-Shot Learning

Authors: Sachin Ravi, Hugo Larochelle

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that this meta-learning model is competitive with deep metric-learning techniques for few-shot learning. In this section, we describe the results of experiments, examining the properties of our model and comparing our method's performance against different approaches.
Researcher Affiliation | Collaboration | Sachin Ravi and Hugo Larochelle, Twitter, Cambridge, USA, {sachinr,hugo}@twitter.com. Work done as an intern at Twitter. Sachin is a Ph.D. student at Princeton University and can be reached at sachinr@princeton.edu.
Pseudocode | Yes | Algorithm 1: Train Meta-Learner
Open Source Code | Yes | Code can be found at https://github.com/twitter/meta-learning-lstm.
Open Datasets | Yes | The Mini-ImageNet dataset was proposed by Vinyals et al. (2016) as a benchmark offering the challenges of the complexity of ImageNet images, without requiring the resources and infrastructure necessary to run on the full ImageNet dataset.
Dataset Splits | Yes | We use 64, 16, and 20 classes for training, validation and testing, respectively.
Hardware Specification | No | The paper does not provide specific hardware details (such as GPU/CPU models or processor types) used for running its experiments.
Software Dependencies | No | The paper mentions using ADAM for optimization but does not provide specific version numbers for software dependencies such as programming languages or libraries.
Experiment Setup | Yes | For the learner, we use a simple CNN containing 4 convolutional layers, each of which is a 3×3 convolution with 32 filters, followed by batch normalization, a ReLU non-linearity, and lastly a 2×2 max-pooling. The network then has a final linear layer followed by a softmax for the number of classes being considered. We train our LSTM with ADAM using a learning rate of 0.001 and with gradient clipping using a value of 0.25.
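
To make the Dataset Splits row concrete, the following is a minimal Python sketch of how 100 Mini-ImageNet classes could be partitioned into the 64/16/20 meta-train/validation/test sets and how one few-shot episode might be sampled from them. This is a hypothetical illustration rather than the authors' data pipeline: the 5-way, 1-shot episode shape, the 15 query images per class, the 600-images-per-class placeholder, and all names are assumptions.

import random

def split_classes(all_classes, n_train=64, n_val=16, n_test=20):
    # Partition class labels into disjoint meta-train/val/test sets.
    assert len(all_classes) == n_train + n_val + n_test
    shuffled = random.sample(all_classes, len(all_classes))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def sample_episode(class_pool, images_by_class, n_way=5, k_shot=1, n_query=15):
    # Draw one episode: k_shot support and n_query query images per class.
    episode_classes = random.sample(class_pool, n_way)
    support, query = [], []
    for label, cls in enumerate(episode_classes):
        imgs = random.sample(images_by_class[cls], k_shot + n_query)
        support += [(img, label) for img in imgs[:k_shot]]
        query += [(img, label) for img in imgs[k_shot:]]
    return support, query

train_cls, val_cls, test_cls = split_classes([f"class_{i}" for i in range(100)])
images_by_class = {c: list(range(600)) for c in train_cls}  # dummy image ids
support, query = sample_episode(train_cls, images_by_class)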
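
The learner architecture and optimizer settings in the Experiment Setup row can likewise be sketched in a few lines. The PyTorch code below is not the authors' released Torch implementation; the 84×84 input resolution, the padding of the 3×3 convolutions, the 5-way output size, the norm-based reading of "gradient clipping", and all module names are assumptions made for illustration, and the LSTM meta-learner appears only as a stand-in placeholder.

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch=32):
    # One of four blocks: 3x3 conv -> batch norm -> ReLU -> 2x2 max-pool.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool2d(2),
    )

class Learner(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            conv_block(3), conv_block(32), conv_block(32), conv_block(32)
        )
        # With assumed 84x84 inputs, four rounds of 2x2 pooling leave a 5x5 map.
        self.classifier = nn.Linear(32 * 5 * 5, num_classes)

    def forward(self, x):
        logits = self.classifier(self.features(x).flatten(1))
        return logits  # softmax is folded into the cross-entropy loss

print(Learner()(torch.randn(2, 3, 84, 84)).shape)  # torch.Size([2, 5])

# The quoted ADAM / clipping settings train the LSTM meta-learner in the paper;
# a small LSTMCell stands in here because the meta-learner is not reproduced.
meta_learner = nn.LSTMCell(input_size=4, hidden_size=2)  # placeholder only
optimizer = torch.optim.Adam(meta_learner.parameters(), lr=0.001)
h, c = meta_learner(torch.randn(1, 4))                   # toy forward step
h.sum().backward()
torch.nn.utils.clip_grad_norm_(meta_learner.parameters(), max_norm=0.25)
optimizer.step()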