Learning Latent Dynamics for Planning from Pixels

Authors: Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson

ICML 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate PlaNet on six continuous control tasks from pixels. |
| Researcher Affiliation | Collaboration | Google Brain, University of Toronto, DeepMind, Google Research, University of Michigan. |
| Pseudocode | Yes | Algorithm 1: Deep Planning Network (PlaNet) |
| Open Source Code | Yes | Please visit https://danijar.com/planet for access to the code and videos of the trained agent. |
| Open Datasets | Yes | For our evaluation, we consider six image-based continuous control tasks of the DeepMind Control Suite (Tassa et al., 2018), shown in Figure 1. |
| Dataset Splits | No | The paper describes iterative data collection and training but does not provide specific percentages or counts for training, validation, or test dataset splits. |
| Hardware Specification | Yes | The training time of 10 to 20 hours (depending on the task) on a single Nvidia V100 GPU compares favorably to that of A3C and D4PG. |
| Software Dependencies | No | The paper states "Our implementation uses TensorFlow Probability (Dillon et al., 2017)" but does not provide a specific version number for this or any other software dependency. |
| Experiment Setup | Yes | We refer to the appendix for hyperparameters (Appendix A) and additional experiments (Appendices C to E). Besides the action repeat, we use the same hyperparameters for all tasks. |
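The evaluation setup quoted in the table (image-based tasks from the DeepMind Control Suite with a per-task action repeat) can be reproduced with the publicly available dm_control package. The sketch below is illustrative only: the task name, the 64x64 render size, and the action-repeat value of 4 are assumptions standing in for the paper's Appendix A settings, and a random policy stands in for PlaNet's latent-space planner.

```python
# Minimal sketch, assuming the dm_control suite API and example hyperparameters.
import numpy as np
from dm_control import suite

env = suite.load(domain_name="cheetah", task_name="run")  # example task, not the full benchmark
action_spec = env.action_spec()

ACTION_REPEAT = 4   # per-task hyperparameter in the paper; 4 is an assumed example value
IMAGE_SIZE = 64     # assumed render size for pixel observations

time_step = env.reset()
total_reward = 0.0
for _ in range(100):
    # A random policy stands in for the CEM planner used by PlaNet.
    action = np.random.uniform(action_spec.minimum, action_spec.maximum,
                               size=action_spec.shape)
    reward = 0.0
    for _ in range(ACTION_REPEAT):
        time_step = env.step(action)
        reward += time_step.reward or 0.0
        if time_step.last():
            break
    # Render the pixel observation the agent would actually receive.
    pixels = env.physics.render(height=IMAGE_SIZE, width=IMAGE_SIZE, camera_id=0)
    total_reward += reward
```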