Scaling Memory-Augmented Neural Networks with Sparse Reads and Writes

Authors: Jack Rae, Jonathan J. Hunt, Ivo Danihelka, Timothy Harley, Andrew W. Senior, Gregory Wayne, Alex Graves, Timothy Lillicrap

NeurIPS 2016

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "To test whether the model is able to learn with this sparse approximation, we examined its performance on a selection of synthetic and natural tasks: algorithmic tasks from the NTM work [7], Babi reasoning tasks used with Memory Networks [17] and Omniglot one-shot classification [16, 12]." |
| Researcher Affiliation | Industry | Jonathan J. Hunt, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P. Lillicrap; Google DeepMind (@google.com email addresses). |
| Pseudocode | No | The paper describes the algorithmic details of SAM and its components (e.g., sparse read, sparse write, memory management, approximate nearest neighbours), but it does not include any explicitly labeled pseudocode or algorithm blocks or figures. (An illustrative sketch of the sparse read appears after this table.) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing its source code, nor does it provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | "We examined its performance on a selection of synthetic and natural tasks: algorithmic tasks from the NTM work [7], Babi reasoning tasks used with Memory Networks [17] and Omniglot one-shot classification [16, 12]." |
| Dataset Splits | No | The paper mentions that "a validation task with 500 characters was used to select the best run" for the Omniglot task, but it does not give split percentages or sample counts for the training, validation, or test sets across all experiments. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory specifications) used to run its experiments. |
| Software Dependencies | No | The paper mentions using "FLANN's randomized k-d tree implementation [15]" and "Torch7: A matlab-like environment for machine learning [5]" but does not give version numbers for these or other software dependencies. |
| Experiment Setup | No | The paper states, "Full hyperparameter details are in Supplementary C," implying that these details are not provided in the main text. The main body describes the model architecture and training components but lacks concrete hyperparameter values or training configurations. |
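Since the paper provides no pseudocode or released source, the following is a minimal Python sketch of the sparse content-based read it describes: attend only to the k memory slots nearest the query, found with an approximate nearest-neighbour index. It is not the authors' implementation; scipy's cKDTree stands in for the FLANN randomized k-d trees cited in the paper, and the function name `sparse_read` and the choice k=4 are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def sparse_read(memory, query, k=4):
    """Sparse content-based read over only the k nearest memory slots.

    memory: (num_slots, word_size) array; query: (word_size,) array.
    Returns the read word, the selected slot indices, and their weights.
    """
    # With unit-normalised vectors, minimising Euclidean distance is
    # equivalent to maximising cosine similarity, so a k-d tree over the
    # normalised memory supports cosine-based lookup.
    mem_norm = memory / np.linalg.norm(memory, axis=1, keepdims=True)
    q_norm = query / np.linalg.norm(query)

    # Approximate nearest-neighbour search (stand-in for FLANN's
    # randomised k-d trees referenced in the paper).
    _, idx = cKDTree(mem_norm).query(q_norm, k=k)

    # Softmax restricted to the k retrieved slots; all other read weights
    # are exactly zero, which is what makes the read sparse.
    sims = mem_norm[idx] @ q_norm
    w = np.exp(sims - sims.max())
    w /= w.sum()

    # The read word is a convex combination of the k selected slots.
    return w @ memory[idx], idx, w

# Toy usage: 1024 memory slots with 16-dimensional words.
rng = np.random.default_rng(0)
M = rng.normal(size=(1024, 16))
read_word, slots, weights = sparse_read(M, rng.normal(size=16))
```

In the full model these read weights also feed the sparse write and usage-based memory management components that the Pseudocode row mentions; the sketch above covers the read path only.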