Hyperbolic Deep Reinforcement Learning

Authors: Edoardo Cetin, Benjamin Paul Chamberlain, Michael M. Bronstein, Jonathan J. Hunt

ICLR 2023

Reproducibility assessment (variable, result, and supporting response):
Research Type: Experimental. "We empirically validate our framework by applying it to popular on-policy and off-policy RL algorithms on the Procgen and Atari 100K benchmarks, attaining near universal performance and generalization benefits."
Researcher Affiliation: Collaboration. Edoardo Cetin (King's College London), Benjamin P. Chamberlain (Charm Therapeutics), Michael M. Bronstein (University of Oxford), Jonathan J. Hunt (Twitter Inc.).
Pseudocode: No. The paper does not contain any pseudocode or algorithm blocks.
Open Source Code: Yes. "To this end, we share our implementation at sites.google.com/view/hyperbolic-rl."
Open Datasets: Yes. "Procgen. The Procgen generalization benchmark (Cobbe et al., 2020) consists of 16 game environments with procedurally-generated random levels..." and "the Atari 100K benchmark (Bellemare et al., 2013; Kaiser et al., 2020) is based on the seminal problems from the Atari Learning Environment (Bellemare et al., 2013)."
Dataset Splits: No. The paper describes training on the first 200 levels of Procgen and evaluating on the full distribution of levels, which constitutes a train/test split; however, it does not explicitly mention a separate validation split for the main experiments.
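For context, the level-based train/test protocol the response refers to is normally configured through the Procgen environment constructor. Below is a minimal sketch assuming the standard procgen gym interface; the choice of game ("coinrun") and distribution_mode are illustrative, not taken from the paper.

```python
import gym

# Training environment: restricted to the first 200 procedurally generated
# levels, matching the train/test protocol described in the paper.
train_env = gym.make(
    "procgen:procgen-coinrun-v0",  # one of the 16 Procgen games
    num_levels=200,                # use only levels 0..199 for training
    start_level=0,
    distribution_mode="easy",
)

# Evaluation environment: num_levels=0 samples from the full, unrestricted
# distribution of levels, so test levels are unseen during training.
test_env = gym.make(
    "procgen:procgen-coinrun-v0",
    num_levels=0,
    start_level=0,
    distribution_mode="easy",
)
```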
Hardware Specification: No. The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for the experiments.
Software Dependencies: No. The paper mentions software such as PyTorch, Geoopt, and Hydra, but does not provide specific version numbers for these dependencies.
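Since the versions are unspecified, anyone reproducing the work has to record them from their own environment. A minimal sketch, assuming the standard PyPI distribution names for the packages the paper mentions:

```python
import importlib.metadata as metadata

# Print pin-style version strings for the dependencies named in the paper.
for pkg in ("torch", "geoopt", "hydra-core"):
    try:
        print(f"{pkg}=={metadata.version(pkg)}")
    except metadata.PackageNotFoundError:
        print(f"{pkg}: not installed")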
Experiment Setup: Yes. "In Table 4 we provide further details of our PPO hyper-parameters, as also described by the original Procgen paper (Cobbe et al., 2020)" and "In Table 5 we provide details of our Rainbow DQN hyper-parameters."
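As an illustration of how such a setup might be expressed with the Hydra/OmegaConf tooling the paper mentions, here is a hypothetical config sketch. The values are the widely cited Procgen PPO defaults from Cobbe et al. (2020), which the paper states it follows; they should be verified against Table 4 of the paper before use.

```python
from omegaconf import OmegaConf

# Hypothetical PPO config in the style of the paper's Table 4; values are
# assumptions based on the commonly reported Cobbe et al. (2020) defaults.
ppo_cfg = OmegaConf.create(
    {
        "learning_rate": 5e-4,
        "num_envs": 64,
        "rollout_length": 256,
        "num_minibatches": 8,
        "ppo_epochs": 3,
        "gamma": 0.999,
        "gae_lambda": 0.95,
        "clip_range": 0.2,
        "entropy_coef": 0.01,
    }
)
print(OmegaConf.to_yaml(ppo_cfg))
```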