Deep Hierarchical Planning from Pixels

Authors: Danijar Hafner, Kuang-Huei Lee, Ian Fischer, Pieter Abbeel

NeurIPS 2022

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | We evaluate Director on two challenging benchmark suites with visual inputs and very sparse rewards, which we expect to be challenging to solve using a flat policy without hierarchy (Section 3.1). We further evaluate Director on a wide range of standard tasks from the literature to demonstrate its generality and ensure that the hierarchy is not harmful in simple settings (Section 3.2).
Researcher Affiliation | Collaboration | 1 UC Berkeley, 2 Google Research, 3 University of Toronto, 4 Covariant
Pseudocode | Yes | For the pseudo code of Director, refer to Appendix E. (An illustrative training-loop sketch follows the table.)
Open Source Code | Yes | Project website with videos and code: https://danijar.com/director. All our agents and environments will be open sourced upon publication to facilitate future research in hierarchical reinforcement learning.
Open Datasets | Yes | We choose Atari games (Bellemare et al., 2013), the Control Suite from pixels (Tassa et al., 2018), Crafter (Hafner, 2021), and tasks from DMLab (Beattie et al., 2016) to cover a spectrum of challenges, including continuous and discrete actions and 2D and 3D environments. (An environment-setup sketch follows the table.)
Dataset Splits | No | The paper does not explicitly provide numerical training, validation, or test dataset splits (e.g., 80/10/10%). It mentions that the world model is trained from a replay buffer and the policies from imagined rollouts, and it discusses evaluation on benchmarks, but no specific dataset partitioning details for reproducibility are given.
Hardware Specification | Yes | Each training run used a single V100 GPU with XLA and mixed precision enabled and completed in less than 24 hours. (A TensorFlow settings sketch follows the table.)
Software Dependencies | No | We implemented Director on top of the public source code of DreamerV2 (Hafner et al., 2020a), reusing its default hyperparameters. The paper names DreamerV2 but does not specify a version number or list other software dependencies with their versions.
Experiment Setup | Yes | We use a fixed set of hyperparameters not only across tasks but also across domains, detailed in Table F.1. (A shared-configuration sketch follows the table.)
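
The paper's actual pseudocode is given in its Appendix E. As a rough, non-authoritative illustration of the hierarchy it describes (a manager that proposes goals in the world model's latent space at a fixed interval and a worker that acts to reach them, both trained on imagined rollouts), here is a minimal Python sketch. Every name (`world_model`, `goal_autoencoder`, `manager`, `worker`, `replay_buffer`) and both default values are placeholders, not the authors' API or settings.

```python
# Illustrative sketch only; the authors' pseudocode is in Appendix E of the paper.
# All object interfaces and default values below are hypothetical placeholders.

def train_director_step(world_model, goal_autoencoder, manager, worker,
                        replay_buffer, goal_every=8, horizon=16):
    """One high-level training iteration of a Director-style agent.

    goal_every and horizon are placeholder defaults; the actual settings
    are listed in Table F.1 of the paper.
    """
    # 1. World-model learning from replayed experience.
    batch = replay_buffer.sample()
    start_state = world_model.train(batch)

    # 2. Goal autoencoder compresses world-model states into compact codes.
    goal_autoencoder.train(start_state)

    # 3. Imagined rollout: the manager picks a new abstract goal every
    #    `goal_every` steps; the worker chooses primitive actions to reach it.
    states, goals = [], []
    state, goal = start_state, None
    for t in range(horizon):
        if t % goal_every == 0:
            code = manager.act(state)             # abstract action
            goal = goal_autoencoder.decode(code)  # goal in world-model state space
        action = worker.act(state, goal)
        state = world_model.step(state, action)
        states.append(state)
        goals.append(goal)

    # 4. Policy learning in imagination: the manager is trained on task reward
    #    plus an exploration bonus, the worker on a goal-reaching reward.
    manager.train(states)
    worker.train(states, goals)
```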
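For the benchmark suites listed in the Open Datasets row, a common way to instantiate them is via the standard `gym`/`ale-py`, `dm_control`, `crafter`, and `deepmind_lab` packages. This is an assumption about tooling, not the authors' environment wrappers, and the chosen tasks are arbitrary examples.

```python
# Illustrative only: standard instantiation of the four benchmark suites named
# in the paper. Package choices and task names are assumptions, not the
# authors' wrappers.
import gym                    # Atari via the Arcade Learning Environment
from dm_control import suite  # DeepMind Control Suite
import crafter                # Crafter survival benchmark
import deepmind_lab           # DMLab 3D environments

atari_env = gym.make("ALE/MsPacman-v5")  # id depends on gym/ale-py version
control_env = suite.load(domain_name="walker", task_name="walk")
crafter_env = crafter.Env()
dmlab_env = deepmind_lab.Lab(
    "contributed/dmlab30/explore_goal_locations_small",  # example level name
    ["RGB_INTERLEAVED"],
    config={"width": "64", "height": "64"},
)
```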
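The hardware row mentions XLA and mixed precision on a single V100. Assuming the TensorFlow 2 stack of the public DreamerV2 code, these options are typically enabled as below; this is a generic sketch, not the authors' configuration.

```python
# Illustrative TensorFlow settings (assumption: TF 2.x); not copied from the
# authors' code or configuration.
import tensorflow as tf

# Enable XLA just-in-time compilation globally.
tf.config.optimizer.set_jit(True)

# Compute in float16 while keeping float32 variables (mixed precision).
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Individual tf.functions can also request XLA compilation explicitly.
@tf.function(jit_compile=True)
def scaled_sum(x):
    return tf.reduce_sum(x * 2.0)

print(scaled_sum(tf.ones([3], dtype=tf.float16)))
```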
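The experiment-setup row states that one hyperparameter set is shared across all tasks and domains (Table F.1). The sketch below only illustrates that pattern; every key and value is a placeholder rather than a number from the paper.

```python
# Illustrative only: the actual hyperparameters are listed in Table F.1 of the
# paper. All values below are placeholders showing the "one config for all
# domains" pattern.
SHARED_CONFIG = {
    "goal_duration_steps": 8,    # placeholder: how often the manager picks a goal
    "imagination_horizon": 16,   # placeholder: imagined rollout length
    "batch_size": 16,            # placeholder
    "learning_rate": 1e-4,       # placeholder
}

def make_agent_config(domain):
    """Return the same hyperparameters regardless of the benchmark domain."""
    assert domain in {"atari", "control_suite", "crafter", "dmlab"}
    return dict(SHARED_CONFIG)  # no per-domain overrides
```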