Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Exploring the Promise and Limits of Real-Time Recurrent Learning
Authors: Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber
ICLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we present the main experiments of this work: RL in POMDPs using realistic game environments requiring memory. Other synthetic/diagnostic experiments are reported in Appendix B.1/B.2. DMLab Memory Tasks. DMLab-30 (Beattie et al., 2016) is a collection of 30 first-person 3D game environments, with a mix of both memory and reactive tasks. Here we focus on two well-known environments, rooms_select_nonmatching_object and rooms_watermaze, which are both categorised as memory tasks according to Parisotto et al. (2020). |
| Researcher Affiliation | Academia | Kazuki Irie¹, Anand Gopalakrishnan², Jürgen Schmidhuber²,³; ¹Center for Brain Science, Harvard University, Cambridge, MA, USA; ²The Swiss AI Lab, IDSIA, USI & SUPSI, Lugano, Switzerland; ³AI Initiative, KAUST, Thuwal, Saudi Arabia |
| Pseudocode | No | The paper provides mathematical derivations and descriptions of algorithms but does not contain a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is public: https://github.com/IDSIA/rtrl-elstm |
| Open Datasets | Yes | DMLab-30 (Beattie et al., 2016), ProcGen (Cobbe et al., 2020), and Atari 2600 (Bellemare et al., 2013) environments. |
| Dataset Splits | No | The paper describes evaluation on 'test episodes' and pre-training steps, but does not explicitly provide details about distinct training, validation, and test dataset splits with percentages or sample counts. |
| Hardware Specification | Yes | Each run requires about a day of training on a V100 GPU (this is also the case with ProcGen and Atari). |
| Software Dependencies | No | The paper mentions 'PyTorch code' and 'torchbeast (Küttler et al., 2019)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Table 4: Hyper-parameters for RL experiments. Parameters at the bottom are common to all settings (which are essentially taken from the Atari configuration of Espeholt et al. (2018)). |