Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Hyperbolic Deep Reinforcement Learning
Authors: Edoardo Cetin, Benjamin Paul Chamberlain, Michael M. Bronstein, Jonathan J Hunt
ICLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our framework by applying it to popular on-policy and off-policy RL algorithms on the Procgen and Atari 100K benchmarks, attaining near universal performance and generalization benefits. |
| Researcher Affiliation | Collaboration | Edoardo Cetin, King's College London; Benjamin P Chamberlain, Charm Therapeutics; Michael M Bronstein, University of Oxford; Jonathan J Hunt, Twitter Inc. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | To this end, we share our implementation at sites.google.com/view/hyperbolic-rl. |
| Open Datasets | Yes | Procgen. The Procgen generalization benchmark (Cobbe et al., 2020) consists of 16 game environments with procedurally-generated random levels... Atari 100K benchmark (Bellemare et al., 2013; Kaiser et al., 2020) is based on the seminal problems from the Atari Learning Environment (Bellemare et al., 2013). |
| Dataset Splits | No | The paper describes training on the 'first 200 levels' and evaluating on the 'full distribution of levels' for Procgen, which is a train/test split. However, it does not explicitly mention a separate validation split for the main experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions software like PyTorch, Geoopt, and Hydra, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | In Table 4 we provide further details of our PPO hyper-parameters, as also described by the original Procgen paper (Cobbe et al., 2020)... In Table 5 we provide details of our Rainbow DQN hyper-parameters. |