Neural Random-Access Machines
Authors: Karol Kurach, Marcin Andrychowicz, Ilya Sutskever
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we evaluate our model on several algorithmic problems whose solutions require pointer manipulation and chasing. Our results show that the proposed model can learn to solve algorithmic tasks of such type and is capable of operating on simple data structures like linked-lists and binary trees. For easier tasks, the learned solutions generalize to sequences of arbitrary length. |
| Researcher Affiliation | Industry | Karol Kurach & Marcin Andrychowicz & Ilya Sutskever Google {kkurach,marcina,ilyasu}@google.com |
| Pseudocode | No | The paper describes the model architecture and operations using prose and mathematical equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing open-source code or a link to a code repository for the described methodology. |
| Open Datasets | No | The paper describes the generation of synthetic datasets for each task (e.g., "For every task, the input is given to the network in the memory tape", "Elements are in random locations in the memory" for List K), but it does not provide concrete access information (link, DOI, formal citation) to a publicly available, pre-existing dataset. |
| Dataset Splits | No | The paper describes a curriculum learning training procedure and mentions testing, but it does not explicitly provide details of standard training/validation/test dataset splits with percentages, sample counts, or predefined splits for a fixed dataset. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU/CPU models, memory) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimization algorithm and ReLU nonlinearity but does not specify versions for any software libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | Yes | The NRAM model is fully differentiable and was trained with the Adam optimization algorithm (Kingma & Ba, 2014) using the negative log-likelihood cost function. The controllers were either multilayer perceptrons (MLPs) with two hidden layers or LSTMs with a hidden layer between the input and the LSTM cells. The ReLU nonlinearity (Nair & Hinton, 2010) was used in all experiments. Training relied on curriculum learning, gradient clipping, gradient noise (random Gaussian noise added to the gradients with exponentially decaying variance), enforcing distribution constraints (rescaling values), and an entropy bonus (decreasing over time). Logarithms were computed as log(max(x, ϵ)) for numerical stability (see the sketch after this table). |
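
The training procedure summarized in the last row combines several stabilization tricks. Below is a minimal sketch of how they might be wired together; it uses PyTorch purely for illustration (the paper does not name a framework), substitutes a stand-in two-hidden-layer MLP for the full NRAM controller, and uses made-up hyperparameters such as the noise decay rate and clipping threshold.

```python
# Hedged sketch of the reported training tricks: Adam, gradient clipping,
# decaying Gaussian gradient noise, and log(max(x, eps)). The NRAM registers,
# modules, and circuit are NOT reproduced here; the controller is a stand-in.
import torch
import torch.nn as nn

def safe_log(x, eps=1e-6):
    # log(max(x, eps)), as described in the paper, to avoid log(0).
    return torch.log(torch.clamp(x, min=eps))

controller = nn.Sequential(            # stand-in two-hidden-layer MLP with ReLU
    nn.Linear(16, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 16),
)
optimizer = torch.optim.Adam(controller.parameters(), lr=1e-3)

for step in range(1000):
    x = torch.rand(32, 16)                    # dummy batch (illustrative only)
    target = torch.randint(0, 16, (32,))      # dummy targets
    probs = torch.softmax(controller(x), dim=-1)
    loss = -safe_log(probs[torch.arange(32), target]).mean()  # NLL cost

    optimizer.zero_grad()
    loss.backward()

    # Gaussian gradient noise with exponentially decaying variance
    # (the decay schedule here is illustrative, not taken from the paper).
    noise_std = 0.01 * (0.999 ** step)
    for p in controller.parameters():
        if p.grad is not None:
            p.grad.add_(torch.randn_like(p.grad) * noise_std)

    # Gradient clipping (threshold is illustrative).
    torch.nn.utils.clip_grad_norm_(controller.parameters(), 1.0)
    optimizer.step()
```

The curriculum learning, distribution-constraint rescaling, and entropy bonus mentioned in the table are omitted from this sketch because they depend on the task schedule and on the NRAM circuit outputs, which the paper describes only at the level of prose.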