Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Towards Non-Saturating Recurrent Units for Modelling Long-Term Dependencies
Authors: Sarath Chandar, Chinnadhurai Sankar, Eugene Vorontsov, Samira Ebrahimi Kahou, Yoshua Bengio | Pages 3280-3287
AAAI 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In a series of synthetic and real world tasks, we demonstrate that the proposed model is the only model that performs among the top 2 models across all tasks with and without long-term dependencies, when compared against a range of other architectures. |
| Researcher Affiliation | Collaboration | 1 Mila, Université de Montréal; 2 Google Brain; 3 Microsoft Research |
| Pseudocode | No | The paper describes the model mathematically using equations (9-17) but does not include structured pseudocode or an algorithm block. |
| Open Source Code | Yes | The code for NRU Cell is available at https://github.com/apsarath/NRU. |
| Open Datasets | Yes | character level language modelling with the Penn Treebank Corpus (PTB) Marcus, Santorini, and Marcinkiewicz (1993). |
| Dataset Splits | Yes | All models were trained for 20 epochs and evaluated on the test set after selecting for each the model state which yields the lowest BPC on the validation set. |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running experiments are provided. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | We used the Adam optimizer Kingma and Ba (2014) with a default learning rate of 0.001 in all our experiments. We clipped the gradients by norm value of 1 for all models except GORU and EURNN since their transition operators do not expand norm. We used a batch size of 10 for most tasks, unless otherwise stated. |
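
The training settings quoted in the Experiment Setup and Dataset Splits rows can be collected into a small configuration sketch. This is an illustration only, not the authors' code: the names `TrainConfig`, `clip_by_global_norm`, `maybe_clip`, and `select_best_epoch` are hypothetical, gradients are represented as a flat list of floats, and clipping is assumed to act on the global L2 norm, which the quoted text does not specify. The 20-epoch figure comes from the Dataset Splits row and may not apply to every task.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values quoted in the table above; field names are hypothetical.
    optimizer: str = "adam"
    learning_rate: float = 1e-3   # "default learning rate of 0.001"
    grad_clip_norm: float = 1.0   # "clipped the gradients by norm value of 1"
    batch_size: int = 10          # "for most tasks, unless otherwise stated"
    epochs: int = 20              # from the Dataset Splits row


# Models the table says were trained without gradient clipping.
NO_CLIP_MODELS = {"GORU", "EURNN"}


def clip_by_global_norm(grads, max_norm):
    """Rescale gradient values so that their L2 norm is at most max_norm."""
    total = math.sqrt(sum(g * g for g in grads))
    if total <= max_norm:
        return list(grads)
    scale = max_norm / total
    return [g * scale for g in grads]


def maybe_clip(model_name, grads, cfg):
    """Apply clipping to every model except GORU and EURNN."""
    if model_name.upper() in NO_CLIP_MODELS:
        return list(grads)
    return clip_by_global_norm(grads, cfg.grad_clip_norm)


def select_best_epoch(val_bpc):
    """Return the index of the epoch with the lowest validation BPC."""
    return min(range(len(val_bpc)), key=val_bpc.__getitem__)
```

For example, `maybe_clip("NRU", [3.0, 4.0], TrainConfig())` rescales a gradient of norm 5 down to norm 1, while the same call with `"GORU"` returns the gradient unchanged. `select_best_epoch` mirrors the model selection described in the Dataset Splits row: the checkpoint evaluated on the test set is the one with the lowest validation BPC.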