Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Training Recurrent Neural Networks via Forward Propagation Through Time
Authors: Anil Kag, Venkatesh Saligrama
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically FPTT outperforms BPTT on a number of well-known benchmark tasks, thus enabling architectures like LSTMs to solve long range dependencies problems. ... We then conduct a number of experiments on benchmark datasets and show that our proposed method is particularly effective on tasks that exhibit long-range dependencies. |
| Researcher Affiliation | Academia | Anil Kag¹ Venkatesh Saligrama¹ ¹Department of Electrical and Computer Engineering, Boston University, USA. Correspondence to: Anil Kag <EMAIL>. |
| Pseudocode | Yes | "Algorithm 1 Training RNN with Back Prop" and "Algorithm 2 Training RNN with FPTT". |
| Open Source Code | Yes | We have released our implementation at https://github.com/anilkagak2/FPTT |
| Open Datasets | Yes | The benchmark datasets used in this study are publicly available along with a train and test split. ... We perform experiments on three variants of the sequence-to-sequence benchmark Penn Tree Bank (PTB) dataset (McAuley & Leskovec, 2013). ... Pixel & Permute MNIST, CIFAR-10 are sequential variants of the popular image classification datasets: MNIST (Lecun et al., 1998) and CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | Yes | For hyper-parameter tuning, we set aside a validation set on tasks where a validation set is not available. ... The benchmark datasets used in this study are publicly available along with a train and test split. |
| Hardware Specification | Yes | We perform our experiments on single GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper states 'We implement FPTT in Pytorch using the pseudo code given by Algorithm 2,' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For fair comparison, following previous works (Zhang et al., 2018; Kusupati et al., 2018; Kag et al., 2020), we use LSTMs with 128 dimensional hidden state and Adam as the choice of optimizer with initial learning rate 1e-3 for both algorithms. ... For the Add Task... a train batch size of 128 is presented to the RNN to update its parameters and evaluated using an independently drawn test set. |