Lie-Access Neural Turing Machines

Authors: Greg Yang, Alexander Rush

ICLR 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To experiment with this approach, we implement a simplified Lie-access neural Turing machine (LANTM) with different Lie groups. We find that this approach is able to perform well on a range of algorithmic tasks." (Section 5, Experiments)
Researcher Affiliation | Academia | Greg Yang and Alexander M. Rush, {gyang@college,srush@seas}.harvard.edu, Harvard University, Cambridge, MA 02138, USA
Pseudocode | No | The paper describes procedures and models using text and mathematical equations, but does not include any explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | "Our implementations are available at https://github.com/harvardnlp/lie-access-memory"
Open Datasets | No | The paper uses custom-designed algorithmic tasks with randomly generated examples, and does not provide a link, DOI, repository, or formal citation to a publicly available or open dataset.
Dataset Splits | No | The paper describes how training and test data are generated based on sequence length, but does not explicitly mention a validation set or provide specific train/validation/test splits (e.g., percentages, sample counts, or citations to predefined splits).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or detailed machine specifications used for running the experiments.
Software Dependencies | No | The paper mentions 'LSTM' and 'torch' (the latter implied by initialization details), but does not provide specific version numbers for software dependencies such as Python, Torch, or CUDA.
Experiment Setup | Yes | "Setup. Our experiments utilize an LSTM controller in a version of the encoder-decoder setup... Model Setup. For all tasks, the LSTM baseline has 1 to 4 layers, each with 256 cells. Each of the other models has a single-layer, 50-cell LSTM controller, with memory width (i.e. the size of each memory vector) 20. Other parameters such as learning rate, decay, and initialization are found through grid search. Further hyperparameter details are given in the appendix. Table A.1: Parameter grid for grid search."
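
The Research Type row quotes the paper's simplified Lie-access neural Turing machine (LANTM). As a rough illustration of the Lie-access idea only, the sketch below stores (key, value) pairs with keys in R^2, moves the read head by the shift group (vector addition) as one example of a Lie group action, and reads with inverse-squared-distance weights; the class and method names are illustrative assumptions and do not come from the released harvardnlp implementation.

```python
import numpy as np

class LieAccessMemory:
    """Minimal sketch of Lie-access memory using the translation (shift) group on R^2.

    Keys are points in R^2; the head moves by adding a controller-emitted shift
    (the group action), and read weights fall off with squared distance between
    the head position and each stored key.
    """

    def __init__(self, mem_width=20):
        self.mem_width = mem_width
        self.keys = []    # list of R^2 key positions
        self.values = []  # list of memory vectors, each of size mem_width

    def write(self, key, value):
        # Append-only write: store a new (key, value) pair at the write head.
        self.keys.append(np.asarray(key, dtype=float))
        self.values.append(np.asarray(value, dtype=float))

    def read(self, head_pos, eps=1e-6):
        # Inverse-squared-distance weighting, normalized to sum to 1.
        keys = np.stack(self.keys)                   # (n, 2)
        vals = np.stack(self.values)                 # (n, mem_width)
        d2 = np.sum((keys - head_pos) ** 2, axis=1)  # squared distances to head
        w = 1.0 / (d2 + eps)
        w = w / w.sum()
        return w @ vals                              # distance-weighted read vector

    @staticmethod
    def move(head_pos, shift):
        # Group action for the shift group: plain vector addition in R^2.
        return head_pos + np.asarray(shift, dtype=float)


# Toy usage: write three vectors at different key positions, move the head,
# and read back a mixture dominated by the nearest key.
mem = LieAccessMemory(mem_width=4)
for i, key in enumerate([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]):
    mem.write(key, np.eye(4)[i])
head = LieAccessMemory.move(np.array([0.0, 0.0]), (1.0, 0.0))
print(mem.read(head))
```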
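The Open Datasets and Dataset Splits rows note that examples are generated on the fly and split by sequence length rather than drawn from a published corpus. Below is a minimal sketch of that kind of generator, assuming a copy task as a stand-in for the paper's task set and using placeholder length ranges and sample counts (none of these numbers are from the paper).

```python
import random

def make_copy_example(length, vocab_size=10):
    """One copy-task example: the target is the input sequence itself.

    The copy task is used purely as a representative algorithmic task;
    the paper's exact tasks differ.
    """
    seq = [random.randrange(1, vocab_size) for _ in range(length)]
    return seq, list(seq)

def make_split(n_examples, min_len, max_len):
    """Generate a set of examples whose lengths lie in [min_len, max_len]."""
    return [make_copy_example(random.randint(min_len, max_len))
            for _ in range(n_examples)]

# Train on shorter sequences and test on longer ones to probe length
# generalization; the ranges and counts here are illustrative placeholders.
train_data = make_split(10000, min_len=2, max_len=16)
test_data = make_split(1000, min_len=17, max_len=32)
```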