Matching Networks for One Shot Learning
Authors: Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, Daan Wierstra
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we describe the results of many experiments, comparing our Matching Networks model against strong baselines. |
| Researcher Affiliation | Industry | Oriol Vinyals Google DeepMind vinyals@google.com Charles Blundell Google DeepMind cblundell@google.com Timothy Lillicrap Google DeepMind countzero@google.com Koray Kavukcuoglu Google DeepMind korayk@google.com Daan Wierstra Google DeepMind wierstra@google.com |
| Pseudocode | No | The paper contains mathematical equations but no explicitly labeled "Pseudocode" or "Algorithm" blocks, nor any structured code-like procedures. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or include any links to a code repository for the described methodology. |
| Open Datasets | Yes | We ran one-shot experiments on three data sets: two image classification sets (Omniglot [14] and ImageNet [19, ILSVRC-2012]) and one language modeling (Penn Treebank). |
| Dataset Splits | Yes | Following [21], we augmented the data set with random rotations by multiples of 90 degrees and used 1200 characters for training, and the remaining character classes for evaluation. [...] We used 80 classes for training and tested on the remaining 20 classes. [...] We split the words into a randomly sampled 9000 for training and 1000 for testing, and we used the standard test set to report results. |
| Hardware Specification | No | The paper mentions 'fits in memory on modern machines' regarding the miniImageNet dataset but does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper describes models and techniques used but does not list specific software libraries or their version numbers necessary for replication. |
| Experiment Setup | Yes | We used a simple yet powerful CNN as the embedding function consisting of a stack of modules, each of which is a 3 × 3 convolution with 64 filters followed by batch normalization [10], a ReLU non-linearity and 2 × 2 max-pooling. We resized all the images to 28 × 28 so that, when we stack 4 modules, the resulting feature map is 1 × 1 × 64, resulting in our embedding function f(x). |
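The embedding function described in the Experiment Setup row can be sketched as follows. This is a minimal reconstruction in PyTorch, which the paper does not specify as its framework; the module names (`conv_block`, `embed`) and the use of `padding=1` to preserve spatial size before pooling are assumptions, chosen so that four modules reduce a 28 × 28 input to a 1 × 1 × 64 feature map as the quote describes.

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    # One module from the quoted setup: 3x3 conv (64 filters),
    # batch normalization, ReLU non-linearity, 2x2 max-pooling.
    # padding=1 is an assumption: it keeps the spatial size fixed
    # before each pooling step (28 -> 14 -> 7 -> 3 -> 1).
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(),
        nn.MaxPool2d(2),
    )

# Embedding function f(x): a stack of 4 modules applied to
# single-channel images resized to 28x28 (as for Omniglot).
embed = nn.Sequential(
    conv_block(1, 64),
    conv_block(64, 64),
    conv_block(64, 64),
    conv_block(64, 64),
    nn.Flatten(),  # 1x1x64 feature map -> 64-dimensional embedding
)

x = torch.randn(5, 1, 28, 28)  # a batch of 5 resized images
print(embed(x).shape)          # torch.Size([5, 64])
```

With four pooling steps the spatial resolution floors from 28 to 14, 7, 3, and finally 1, matching the 1 × 1 × 64 feature map in the quote.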