Memory-Efficient Backpropagation Through Time

Authors: Audrūnas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We used an LSTM mapping 256 inputs to 256 with a batch size of 64 and measured execution time for a single gradient descent step (forward and backward operation combined) as a function of sequence length (Figure 2(b)).
Researcher Affiliation | Industry | Audrūnas Gruslys, Google DeepMind, audrunas@google.com; Rémi Munos, Google DeepMind, munos@google.com; Ivo Danihelka, Google DeepMind, danihelka@google.com; Marc Lanctot, Google DeepMind, lanctot@google.com; Alex Graves, Google DeepMind, gravesa@google.com
Pseudocode | Yes | Pseudocode is given in the supplementary material. (A general-idea sketch of the method follows this table.)
Open Source Code | No | The paper states 'Pseudocode is given in the supplementary material' but does not provide access to the source code for the methodology (no link or explicit statement of public release).
Open Datasets | No | The paper mentions using 'an LSTM mapping 256 inputs to 256' but does not name a publicly available dataset or provide a link, DOI, or formal citation.
Dataset Splits | No | The paper does not provide the dataset split information (percentages, sample counts, or partitioning methodology) needed to reproduce the data partitioning.
Hardware Specification | No | The paper mentions Graphics Processing Units (GPUs) in general but does not give specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used in its experiments.
Software Dependencies | No | The paper does not list specific ancillary software, such as library names with version numbers, needed to replicate the experiment.
Experiment Setup | Yes | We used an LSTM mapping 256 inputs to 256 with a batch size of 64 and measured execution time for a single gradient descent step (forward and backward operation combined) as a function of sequence length (Figure 2(b)). (See the timing sketch after this table.)
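
The Experiment Setup row describes timing a single combined forward and backward pass of a 256-to-256 LSTM with batch size 64 across sequence lengths. The paper names neither a framework nor a dataset, so the following is only a minimal sketch of that measurement: PyTorch, synthetic random inputs, and the particular sequence lengths and repeat count are all assumptions made for illustration.

```python
# Minimal sketch of the timing measurement from the Experiment Setup row.
# Assumptions (not from the paper): PyTorch, synthetic random data,
# the listed sequence lengths, and 5 repeats per measurement.
import time
import torch

INPUT_SIZE, HIDDEN_SIZE, BATCH_SIZE = 256, 256, 64
device = "cuda" if torch.cuda.is_available() else "cpu"

lstm = torch.nn.LSTM(INPUT_SIZE, HIDDEN_SIZE).to(device)

def time_single_step(seq_len: int, repeats: int = 5) -> float:
    """Mean wall-clock time (seconds) of one forward+backward pass."""
    # Synthetic inputs of shape (seq_len, batch, input_size); no dataset is specified.
    x = torch.randn(seq_len, BATCH_SIZE, INPUT_SIZE, device=device)
    times = []
    for _ in range(repeats):
        lstm.zero_grad(set_to_none=True)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.perf_counter()
        output, _ = lstm(x)       # forward over the whole sequence
        output.sum().backward()   # backward through time
        if device == "cuda":
            torch.cuda.synchronize()
        times.append(time.perf_counter() - start)
    return sum(times) / len(times)

if __name__ == "__main__":
    for seq_len in (100, 200, 500, 1000, 2000):
        print(f"seq_len={seq_len:5d}  step_time={time_single_step(seq_len):.4f}s")
```

Plotting the printed step times against sequence length would reproduce the shape of the measurement reported in Figure 2(b), though absolute numbers depend entirely on the hardware and framework used.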
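For context on the method named in the title and the Pseudocode row, the sketch below illustrates the general idea behind memory-efficient backpropagation through time: keep hidden states only at chunk boundaries and recompute intermediate activations during the backward pass. It uses torch.utils.checkpoint as a stand-in and does not implement the paper's dynamic-programming checkpoint schedule; the chunk size, LSTMCell, and sequence length are assumptions made for illustration.

```python
# General-idea sketch of checkpointed BPTT (NOT the paper's dynamic-programming
# policy): only hidden states at chunk boundaries are stored; activations inside
# each chunk are recomputed on the backward pass, trading compute for memory.
import torch
from torch.utils.checkpoint import checkpoint

INPUT_SIZE, HIDDEN_SIZE, BATCH_SIZE, CHUNK = 256, 256, 64, 50  # assumed values

cell = torch.nn.LSTMCell(INPUT_SIZE, HIDDEN_SIZE)

def run_chunk(x_chunk, h, c):
    """Unroll the LSTM cell over one chunk of timesteps."""
    for t in range(x_chunk.size(0)):
        h, c = cell(x_chunk[t], (h, c))
    return h, c

def checkpointed_bptt(x):
    """Forward over the full sequence, checkpointing at chunk boundaries."""
    h = x.new_zeros(BATCH_SIZE, HIDDEN_SIZE)
    c = x.new_zeros(BATCH_SIZE, HIDDEN_SIZE)
    for start in range(0, x.size(0), CHUNK):
        h, c = checkpoint(run_chunk, x[start:start + CHUNK], h, c,
                          use_reentrant=False)
    return h

x = torch.randn(200, BATCH_SIZE, INPUT_SIZE)  # synthetic sequence
loss = checkpointed_bptt(x).sum()
loss.backward()  # intermediate activations are recomputed per chunk
```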