Exploring the Promise and Limits of Real-Time Recurrent Learning
Authors: Kazuki Irie, Anand Gopalakrishnan, Jürgen Schmidhuber
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we present the main experiments of this work: RL in POMDPs using realistic game environments requiring memory. Other synthetic/diagnostic experiments are reported in Appendix B.1/B.2. DMLab Memory Tasks. DMLab-30 (Beattie et al., 2016) is a collection of 30 first-person 3D game environments, with a mix of both memory and reactive tasks. Here we focus on two well-known environments, `rooms_select_nonmatching_object` and `rooms_watermaze`, which are both categorised as memory tasks according to Parisotto et al. (2020). |
| Researcher Affiliation | Academia | Kazuki Irie¹, Anand Gopalakrishnan², Jürgen Schmidhuber²,³. ¹Center for Brain Science, Harvard University, Cambridge, MA, USA; ²The Swiss AI Lab, IDSIA, USI & SUPSI, Lugano, Switzerland; ³AI Initiative, KAUST, Thuwal, Saudi Arabia. kirie@fas.harvard.edu, {anand, juergen}@idsia.ch |
| Pseudocode | No | The paper provides mathematical derivations and descriptions of algorithms but does not contain a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is public: https://github.com/IDSIA/rtrl-elstm |
| Open Datasets | Yes | DMLab-30 (Beattie et al., 2016), ProcGen (Cobbe et al., 2020), and Atari 2600 (Bellemare et al., 2013) environments. |
| Dataset Splits | No | The paper describes evaluation on 'test episodes' and pre-training steps, but does not explicitly provide details about distinct training, validation, and test dataset splits with percentages or sample counts. |
| Hardware Specification | Yes | Each run requires about a day of training on a V100 GPU (this is also the case with ProcGen and Atari). |
| Software Dependencies | No | The paper mentions 'PyTorch code' and 'torchbeast (Küttler et al., 2019)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Table 4: Hyper-parameters for RL experiments. Parameters at the bottom are common to all settings (which are essentially taken from the Atari configuration of Espeholt et al. (2018)). |