Neural Random-Access Machines

Authors: Karol Kurach, Marcin Andrychowicz, Ilya Sutskever

ICLR 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we evaluate our model on several algorithmic problems whose solutions require pointer manipulation and chasing. Our results show that the proposed model can learn to solve algorithmic tasks of this type and is capable of operating on simple data structures like linked lists and binary trees. For easier tasks, the learned solutions generalize to sequences of arbitrary length. (Pointer chasing over a flat memory is illustrated in the first sketch after this table.)
Researcher Affiliation | Industry | Karol Kurach & Marcin Andrychowicz & Ilya Sutskever, Google. {kkurach,marcina,ilyasu}@google.com
Pseudocode | No | The paper describes the model architecture and operations in prose and mathematical equations but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code or a link to a code repository for the described methodology.
Open Datasets | No | The paper describes generating synthetic data for each task (e.g., "For every task, the input is given to the network in the memory tape"; for List-K, "Elements are in random locations in the memory"), but it does not provide concrete access information (link, DOI, formal citation) for a publicly available, pre-existing dataset. (A hypothetical generator for such an instance is sketched after this table.)
Dataset Splits | No | The paper describes a curriculum-learning training procedure and mentions testing, but it does not explicitly specify training/validation/test splits with percentages, sample counts, or predefined splits for a fixed dataset.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using the Adam optimization algorithm and the ReLU nonlinearity but does not specify versions for any software libraries, frameworks, or programming languages used in the implementation.
Experiment Setup | Yes | The NRAM model is fully differentiable and was trained with the Adam optimization algorithm (Kingma & Ba, 2014) and a negative log-likelihood cost function. The controllers were multilayer perceptrons (MLPs) with two hidden layers, or LSTMs with a hidden layer between the input and the LSTM cells. The ReLU nonlinearity (Nair & Hinton, 2010) was used in all experiments. Training used curriculum learning, gradient clipping, gradient noise (random Gaussian noise added to the gradients, decaying exponentially), enforcement of distribution constraints (rescaling values), and an entropy bonus that decreases over time. Logarithms were computed as log(max(x, ϵ)). (A minimal training-loop sketch appears below.)
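
The Research Type row above describes tasks whose solutions require pointer chasing over data structures like linked lists stored in memory. The following minimal sketch illustrates what that means on a flat memory array (a stand-in for the NRAM memory tape); the exact encoding in the paper may differ, and `follow_pointers` and the two-cell node layout are assumptions made here for illustration.

```python
# Illustration of pointer chasing over a flat memory array.
# Each list node occupies two cells: memory[p] is the address of the
# next node and memory[p + 1] is the node's value. This layout is an
# assumption for illustration, not the paper's exact encoding.

def follow_pointers(memory, head, k):
    """Return the value of the k-th list element by chasing next-pointers."""
    p = head
    for _ in range(k):
        p = memory[p]       # dereference: jump to the next node
    return memory[p + 1]    # read the value stored at the final node

# A 3-element list [10, 20, 30] scattered through memory:
# node at 4 (next=0, value=10) -> node at 0 (next=6, value=20) -> node at 6.
memory = [6, 20, 0, 0, 0, 10, 6, 30]
print(follow_pointers(memory, head=4, k=2))  # -> 30
```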
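The Open Datasets row notes that data is generated synthetically per task, with list elements placed at random memory locations. This is a hypothetical generator for a List-K-style instance; the paper does not release its generator, so the function name, node layout, and value ranges below are all assumptions.

```python
import random

def make_list_k_instance(n_items=4, mem_size=16):
    """Generate one synthetic List-K-style instance: a linked list whose
    two-cell nodes (next pointer, value) sit at random memory locations.
    Hypothetical generator -- the paper's exact encoding is unspecified."""
    memory = [0] * mem_size
    # Two cells per node; skip slot 0 so address 0 can mark the list's end.
    slots = random.sample(range(1, mem_size // 2), n_items)
    addrs = [2 * s for s in slots]
    for i, a in enumerate(addrs):
        memory[a] = addrs[i + 1] if i + 1 < n_items else 0  # next (0 = end)
        memory[a + 1] = random.randint(1, 9)                # payload value
    k = random.randrange(n_items)
    target = memory[addrs[k] + 1]   # value of the k-th element
    return memory, addrs[0], k, target
```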
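The Experiment Setup row lists several training tricks. The sketch below shows how they compose in a single training loop, written against PyTorch (the paper predates it); the toy model, data, and every numeric hyperparameter are placeholder assumptions, not the authors' settings.

```python
import torch

EPS = 1e-6

def safe_log(x):
    # log(max(x, eps)), as the paper uses to avoid log(0).
    return torch.log(torch.clamp(x, min=EPS))

torch.manual_seed(0)
model = torch.nn.Sequential(                    # stand-in for the MLP controller
    torch.nn.Linear(8, 32), torch.nn.ReLU(),
    torch.nn.Linear(32, 32), torch.nn.ReLU(),   # two hidden layers, ReLU
    torch.nn.Linear(32, 4),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)  # Adam optimizer

x = torch.randn(64, 8)                  # toy data in place of the NRAM tasks
y = torch.randint(0, 4, (64,))

for step in range(200):
    probs = torch.softmax(model(x), dim=-1)
    nll = -safe_log(probs[torch.arange(64), y]).mean()   # NLL cost function
    entropy = -(probs * safe_log(probs)).sum(-1).mean()  # output entropy
    entropy_coef = 0.05 * 0.995 ** step                  # bonus decays over time
    loss = nll - entropy_coef * entropy

    opt.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)  # gradient clipping
    noise_std = 0.01 * 0.999 ** step    # Gaussian gradient noise, decaying
    for p in model.parameters():
        if p.grad is not None:
            p.grad += noise_std * torch.randn_like(p.grad)
    opt.step()
```

Curriculum learning and the distribution-constraint rescaling are task-specific and are omitted here; they would wrap this loop (gradually increasing task difficulty) and post-process the model's output distributions, respectively.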