Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes
Authors: Jack Rae, Jonathan J. Hunt, Ivo Danihelka, Timothy Harley, Andrew W. Senior, Gregory Wayne, Alex Graves, Timothy Lillicrap
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To test whether the model is able to learn with this sparse approximation, we examined its performance on a selection of synthetic and natural tasks: algorithmic tasks from the NTM work [7], Babi reasoning tasks used with Memory Networks [17] and Omniglot one-shot classification [16, 12]. |
| Researcher Affiliation | Industry | Author listing gives a Google DeepMind affiliation with @google.com email addresses: Jonathan J. Hunt, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P. Lillicrap. |
| Pseudocode | No | The paper describes algorithmic details of SAM and its components (e.g., sparse read, write, memory management, ANN), but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing its source code, nor does it provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | We examined its performance on a selection of synthetic and natural tasks: algorithmic tasks from the NTM work [7], Babi reasoning tasks used with Memory Networks [17] and Omniglot one-shot classification [16, 12]. |
| Dataset Splits | No | The paper mentions that 'a validation task with 500 characters was used to select the best run' for the Omniglot task, but it does not provide specific details on dataset split percentages or sample counts for training, validation, or test sets across all experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used to run its experiments. |
| Software Dependencies | No | The paper mentions using 'FLANN's randomized k-d tree implementation [15]' and 'Torch7: A matlab-like environment for machine learning [5]' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | No | The paper states, 'Full hyperparameter details are in Supplementary C,' implying that these details are not provided in the main text. The main body describes the model architecture and training components but lacks specific hyperparameter values or concrete training configurations. |
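Since the paper ships no pseudocode or source, the sparse read it describes (content-based addressing restricted to the K most similar memory rows, with candidates found via an approximate-nearest-neighbour index such as FLANN's randomized k-d trees) can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation; the exact top-K search below stands in for the ANN index, and the function name and cosine-similarity choice are assumptions.

```python
import numpy as np

def sparse_read(memory, query, k=4):
    """Sketch of a sparse content-based read: attend over only the K
    most similar memory rows; all other read weights are exactly zero.
    Exact top-K via argpartition stands in for the ANN lookup (e.g.
    FLANN's randomized k-d trees) used in the paper."""
    # Cosine similarity between the query and every memory row.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(query) + 1e-8
    sims = memory @ query / norms
    # Indices of the K highest-similarity rows.
    top_k = np.argpartition(-sims, k - 1)[:k]
    # Softmax over the K candidates only; remaining weights stay zero,
    # so the read touches O(K) rows instead of all N.
    weights = np.zeros(len(memory))
    w = np.exp(sims[top_k] - sims[top_k].max())
    weights[top_k] = w / w.sum()
    # Read vector is the sparse weighted sum of memory rows.
    return weights @ memory, weights
```

With K fixed, the cost of a read is independent of the number of memory slots N (apart from the ANN query), which is the scaling property the paper's experiments test.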