Hyperbolic Deep Reinforcement Learning
Authors: Edoardo Cetin, Benjamin Paul Chamberlain, Michael M. Bronstein, Jonathan J Hunt
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our framework by applying it to popular on-policy and off-policy RL algorithms on the Procgen and Atari 100K benchmarks, attaining near universal performance and generalization benefits. |
| Researcher Affiliation | Collaboration | Edoardo Cetin (King's College London), Benjamin P Chamberlain (Charm Therapeutics), Michael M Bronstein (University of Oxford), Jonathan J Hunt (Twitter Inc.) |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | To this end, we share our implementation at sites.google.com/view/hyperbolic-rl. |
| Open Datasets | Yes | Procgen. The Procgen generalization benchmark (Cobbe et al., 2020) consists of 16 game environments with procedurally-generated random levels... Atari 100K benchmark (Bellemare et al., 2013; Kaiser et al., 2020) is based on the seminal problems from the Atari Learning Environment (Bellemare et al., 2013). |
| Dataset Splits | No | The paper describes training on the 'first 200 levels' and evaluating on the 'full distribution of levels' for Procgen, which is a train/test split. However, it does not explicitly mention a separate validation split for the main experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions software like PyTorch, Geoopt, and Hydra, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | "In Table 4 we provide further details of our PPO hyper-parameters, as also described by the original Procgen paper (Cobbe et al., 2020)." and "In Table 5 we provide details of our Rainbow DQN hyper-parameters." |
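
The Procgen split noted under Dataset Splits (training on the first 200 levels, evaluating on the full distribution of levels) can be sketched as a pair of environment settings. This is a minimal illustration, not the authors' code: the helper name `procgen_split_kwargs` is hypothetical, and it assumes the `num_levels` and `start_level` options of the public Procgen package, where `num_levels=0` requests the unrestricted level distribution. The game name in the usage comment is likewise only an example.

```python
from typing import Any, Dict


def procgen_split_kwargs(
    num_train_levels: int = 200, start_level: int = 0
) -> Dict[str, Dict[str, Any]]:
    """Build keyword arguments for a train/test pair of Procgen environments.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    # Training is restricted to a fixed set of levels starting at `start_level`.
    train = {"num_levels": num_train_levels, "start_level": start_level}
    # Assumption: in Procgen, num_levels=0 means levels are drawn from the
    # full (unrestricted) distribution, which is used here for evaluation.
    test = {"num_levels": 0, "start_level": start_level}
    return {"train": train, "test": test}


splits = procgen_split_kwargs()

# Possible usage, assuming the `gym` and `procgen` packages are installed:
#   import gym
#   train_env = gym.make("procgen:procgen-coinrun-v0", **splits["train"])
#   test_env = gym.make("procgen:procgen-coinrun-v0", **splits["test"])
```

Because the table reports no separate validation split, the sketch defines only the train and test settings.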