Matching Networks for One Shot Learning

Authors: Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, Daan Wierstra

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility variables, results, and the LLM's supporting responses:
Research Type: Experimental. "In this section we describe the results of many experiments, comparing our Matching Networks model against strong baselines."
Researcher Affiliation: Industry. "Oriol Vinyals, Google DeepMind, vinyals@google.com; Charles Blundell, Google DeepMind, cblundell@google.com; Timothy Lillicrap, Google DeepMind, countzero@google.com; Koray Kavukcuoglu, Google DeepMind, korayk@google.com; Daan Wierstra, Google DeepMind, wierstra@google.com"
Pseudocode: No. The paper contains mathematical equations but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor any structured code-like procedures.
Open Source Code: No. The paper does not provide any explicit statement about releasing source code, nor any links to a code repository for the described methodology.
Open Datasets: Yes. "We ran one-shot experiments on three data sets: two image classification sets (Omniglot [14] and ImageNet [19, ILSVRC-2012]) and one language modeling (Penn Treebank)."
Dataset Splits: Yes. "Following [21], we augmented the data set with random rotations by multiples of 90 degrees and used 1200 characters for training, and the remaining character classes for evaluation. [...] We used 80 classes for training and tested on the remaining 20 classes. [...] We split the words into a randomly sampled 9000 for training and 1000 for testing, and we used the standard test set to report results."
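The quoted split is at the class level (whole character classes held out, not individual images), which is what makes the evaluation one-shot. A minimal sketch of the Omniglot protocol, assuming 1623 character classes and, following [21], treating each 90-degree rotation as its own class (the paper describes the rotations but not this bookkeeping detail explicitly; class IDs here are hypothetical stand-ins):

```python
import random

random.seed(0)


def rot90(img):
    """Rotate a 2D image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


# Stand-in for Omniglot's 1623 character classes.
n_classes = 1623
class_ids = list(range(n_classes))

# Augmentation: rotations by multiples of 90 degrees, each rotated
# variant treated as a separate class (assumption, following [21]).
rotations = (0, 90, 180, 270)
augmented = [(c, r) for c in class_ids for r in rotations]

# Class-level split: 1200 characters for training, the rest held out,
# so evaluation classes are never seen during training.
random.shuffle(class_ids)
train_chars = set(class_ids[:1200])
train = [cr for cr in augmented if cr[0] in train_chars]
evaluation = [cr for cr in augmented if cr[0] not in train_chars]

print(len(train), len(evaluation))  # 4800 1692
```

The key design choice sketched here is that train/evaluation sets share no character classes, only the augmentation procedure.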
Hardware Specification: No. The paper mentions that the miniImageNet dataset 'fits in memory on modern machines' but does not provide specific hardware details (e.g., exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies: No. The paper describes the models and techniques used but does not list specific software libraries or their version numbers necessary for replication.
Experiment Setup: Yes. "We used a simple yet powerful CNN as the embedding function, consisting of a stack of modules, each of which is a 3×3 convolution with 64 filters followed by batch normalization [10], a ReLU non-linearity and 2×2 max-pooling. We resized all the images to 28×28 so that, when we stack 4 modules, the resulting feature map is 1×1×64, resulting in our embedding function f(x)."
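The quoted setup implies a specific shape arithmetic: four conv/pool modules must reduce 28×28 down to 1×1. A minimal sketch checking this, assuming stride-1 convolutions with padding 1 ("same" padding) and non-overlapping 2×2 pooling; the paper does not state the padding, but it is the choice consistent with a 1×1×64 output:

```python
# Shape walkthrough of the embedding CNN described above: four modules of
# [3x3 conv (64 filters) -> batch norm -> ReLU -> 2x2 max-pool].


def module_output_size(size, kernel=3, pad=1, pool=2):
    """Spatial size after one stride-1 conv followed by a 2x2 max-pool."""
    conv_out = size + 2 * pad - kernel + 1  # stride-1 convolution
    return conv_out // pool                 # non-overlapping pooling (floor)


size, channels = 28, 1
for _ in range(4):  # 28 -> 14 -> 7 -> 3 -> 1
    size = module_output_size(size)
    channels = 64  # every module outputs 64 filters

print(size, size, channels)  # 1 1 64
```

Note that the 7 → 3 step relies on the pooling layer discarding the odd border row/column (floor division), which is the default behavior in common deep learning frameworks.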