Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Optimization as a Model for Few-Shot Learning
Authors: Sachin Ravi, Hugo Larochelle
ICLR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that this meta-learning model is competitive with deep metric-learning techniques for few-shot learning. In this section, we describe the results of experiments, examining the properties of our model and comparing our method's performance against different approaches. |
| Researcher Affiliation | Collaboration | Sachin Ravi and Hugo Larochelle Twitter, Cambridge, USA EMAIL done as an intern at Twitter. Sachin is a PhD student at Princeton University and can be reached at EMAIL. |
| Pseudocode | Yes | Algorithm 1 Train Meta-Learner |
| Open Source Code | Yes | Code can be found at https://github.com/twitter/meta-learning-lstm. |
| Open Datasets | Yes | The Mini-ImageNet dataset was proposed by Vinyals et al. (2016) as a benchmark offering the challenges of the complexity of ImageNet images, without requiring the resources and infrastructure necessary to run on the full ImageNet dataset. |
| Dataset Splits | Yes | We use 64, 16, and 20 classes for training, validation and testing, respectively. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU/CPU models or processor types) used for running its experiments. |
| Software Dependencies | No | The paper mentions using ADAM for optimization but does not provide specific version numbers for software dependencies like programming languages or libraries. |
| Experiment Setup | Yes | For the learner, we use a simple CNN containing 4 convolutional layers, each of which is a 3 × 3 convolution with 32 filters, followed by batch normalization, a ReLU non-linearity, and lastly a 2 × 2 max-pooling. The network then has a final linear layer followed by a softmax for the number of classes being considered. We train our LSTM with ADAM using a learning rate of 0.001 and with gradient clipping using a value of 0.25. |
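
The learner architecture quoted in the Experiment Setup row can be made concrete with a small shape and parameter tally. The following is a minimal sketch in plain Python, not the authors' implementation. The block count, kernel size, filter count, and pooling size come from the row above. The 84 × 84 RGB input, the "same" convolution padding, and the 5-class output are assumptions introduced only for illustration, and `LearnerConfig` and `describe_learner` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class LearnerConfig:
    # Values stated in the Experiment Setup row.
    num_conv_blocks: int = 4
    kernel_size: int = 3
    num_filters: int = 32
    pool_size: int = 2
    # Assumptions (NOT stated in the table above).
    input_size: int = 84     # assumed square input resolution
    input_channels: int = 3  # assumed RGB images
    num_classes: int = 5     # assumed 5-way classification episode


def describe_learner(cfg):
    """Return (per-block summaries, total parameter count) for the learner CNN."""
    blocks = []
    size, channels, total = cfg.input_size, cfg.input_channels, 0
    for _ in range(cfg.num_conv_blocks):
        # Convolution weights plus one bias per filter.
        conv = cfg.kernel_size ** 2 * channels * cfg.num_filters + cfg.num_filters
        # Batch normalization: one scale and one shift per filter.
        bn = 2 * cfg.num_filters
        # Assumed "same" padding keeps the spatial size through the convolution;
        # max-pooling then divides it by pool_size, rounding down.
        size //= cfg.pool_size
        channels = cfg.num_filters
        total += conv + bn
        blocks.append({"out_size": size, "out_channels": channels, "params": conv + bn})
    # Final linear layer over the flattened feature map (softmax adds no parameters).
    features = channels * size * size
    linear = features * cfg.num_classes + cfg.num_classes
    return blocks, total + linear


blocks, total_params = describe_learner(LearnerConfig())
for index, block in enumerate(blocks, start=1):
    print(index, block)
print("total parameters:", total_params)
```

Changing `input_size`, `input_channels`, or `num_classes` shows how strongly the size of the final linear layer depends on the assumed values; the convolutional blocks depend only on the channel counts.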