Mastering Atari with Discrete World Models
Authors: Danijar Hafner, Timothy P Lillicrap, Mohammad Norouzi, Jimmy Ba
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate DreamerV2 on the well-established Atari benchmark with sticky actions, comparing to four strong model-free algorithms. |
| Researcher Affiliation | Collaboration | Danijar Hafner, Google Research; Timothy Lillicrap, DeepMind; Mohammad Norouzi, Google Research; Jimmy Ba, University of Toronto |
| Pseudocode | Yes | Algorithm 1: Straight-Through Gradients with Automatic Differentiation |
| Open Source Code | Yes | Refer to the project website for videos, the source code, and training curves in JSON format: https://danijar.com/dreamerv2 |
| Open Datasets | Yes | We evaluate DreamerV2 on the well-established Atari benchmark with sticky actions, comparing to four strong model-free algorithms. |
| Dataset Splits | No | No explicit mention of training/validation/test dataset splits with percentages or sample counts. The paper describes data generation through interaction with the Atari environment, not pre-split datasets. |
| Hardware Specification | Yes | Our implementation of DreamerV2 reaches 200M environment steps in under 10 days, while using only a single NVIDIA V100 GPU and a single environment instance. |
| Software Dependencies | No | The paper mentions software components like "Adam optimizer" and "ELU activation function" but does not provide specific version numbers for any software libraries or frameworks used. |
| Experiment Setup | Yes | Table D1: Atari hyperparameters of DreamerV2. When tuning the agent for a new task, we recommend searching over the KL loss scale β ∈ {0.1, 0.3, 1, 3}, actor entropy loss scale η ∈ {3e-5, 1e-4, 3e-4, 1e-3}, and the discount factor γ ∈ {0.99, 0.999}. |
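The pseudocode row above refers to the paper's Algorithm 1, which uses straight-through gradients to backpropagate through discrete categorical samples: the forward pass uses the hard one-hot sample, while the backward pass routes gradients through the underlying probabilities. A minimal framework-free sketch of this identity is shown below; `stop_gradient` is a placeholder for an autodiff framework's gradient-blocking operation (e.g. `tf.stop_gradient`), and is numerically the identity, so the expression `one_hot + probs - stop_gradient(probs)` evaluates exactly to the one-hot sample.

```python
import random

def stop_gradient(x):
    # Placeholder for an autodiff framework's stop-gradient op:
    # numerically the identity, but blocks gradient flow in the backward pass.
    return x

def straight_through_sample(probs, rng=random):
    """Sample a one-hot vector from a categorical distribution with
    straight-through gradients (sketch of the idea in Algorithm 1)."""
    # Draw a hard categorical sample as a one-hot vector.
    idx = rng.choices(range(len(probs)), weights=probs)[0]
    one_hot = [1.0 if i == idx else 0.0 for i in range(len(probs))]
    # Forward value equals the hard sample because probs - sg(probs) = 0;
    # in an autodiff framework, gradients would flow through `probs`.
    return [h + p - stop_gradient(p) for h, p in zip(one_hot, probs)]
```

In a real implementation this trick lets the world model's discrete latents stay differentiable end-to-end without REINFORCE-style estimators.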