Learning Awareness Models

Authors: Brandon Amos, Laurent Dinh, Serkan Cabi, Thomas Rothörl, Sergio Gómez Colmenarejo, Alistair Muldal, Tom Erez, Yuval Tassa, Nando de Freitas, Misha Denil

ICLR 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our dynamics model is able to successfully predict distributions over 132 sensor readings over 100 steps into the future, and we demonstrate that even when the body is no longer in contact with an object, the latent variables of the dynamics model continue to represent its shape. We show that active data collection by maximizing the entropy of predictions about the body touch sensors, proprioception and vestibular information leads to learning of dynamics models that show superior performance when used for control. We also collect data from a real robotic hand and show that the same models can be used to answer questions about properties of objects in the real world. (A hedged sketch of this entropy objective follows the table.) |
| Researcher Affiliation | Collaboration | Brandon Amos (1), Laurent Dinh (2), Serkan Cabi (3), Thomas Rothörl (3), Sergio Gómez Colmenarejo (3), Alistair Muldal (3), Tom Erez (3), Yuval Tassa (3), Nando de Freitas (3,4), Misha Denil (3); (1) Carnegie Mellon University, (2) University of Montreal, (3) DeepMind, (4) CIFAR |
| Pseudocode | No | The paper contains architectural diagrams and descriptions of methods but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides a link to qualitative result videos (https://goo.gl/mZuqAV) but does not provide concrete access to the source code for the methodology described in the paper, nor does it explicitly state that the code is open-sourced. |
| Open Datasets | Yes | Our simulated body is a model of the hand of the Johns Hopkins Modular Prosthetic Limb (Johannes et al., 2011), realized in MuJoCo (Todorov et al., 2012). The hand is from the Johns Hopkins Modular Prosthetic Limb, which the authors refer to as the MPL hand, or simply the hand. This model is distributed with the MuJoCo HAPTIX software and is available for download from the MuJoCo website (http://www.mujoco.org/book/haptix.html). (A model-loading sketch follows the table.) |
| Dataset Splits | No | For the real-world data, a train/test split is mentioned: 'We use the 47 trajectories from the initial session as test data, and use the remaining 1093 trajectories for training.' However, no explicit validation split is specified for either the simulated or the real-world experiments, and no explicit train/validation/test splits are provided for the simulated environment beyond a mention of '5000 test episodes'. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used to run its experiments. |
| Software Dependencies | No | The paper mentions software such as MuJoCo, the Adam optimizer, and the framework of Horgan et al. (2018), but does not provide version numbers for any of these software dependencies. |
| Experiment Setup | Yes | The full set of hyperparameters for this model can be found in Appendix D. Table 1 shows hyperparameters for several of the models used in the experiments. |
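
To make the "active data collection by maximizing the entropy of predictions" quoted in the Research Type row concrete, here is a minimal sketch of how such an intrinsic reward could be scored. It assumes the dynamics model emits a diagonal-Gaussian predictive distribution over the 132 sensor readings at each future step; `gaussian_entropy`, `entropy_reward`, and the random stand-in predictions below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

N_SENSORS = 132  # sensor readings the model predicts (from the paper)
HORIZON = 100    # the paper predicts up to 100 steps into the future

def gaussian_entropy(sigma):
    """Entropy of a diagonal Gaussian with per-dimension std `sigma`."""
    return 0.5 * np.sum(np.log(2.0 * np.pi * np.e * sigma ** 2))

def entropy_reward(pred_sigmas):
    """Total predictive entropy over a rollout; `pred_sigmas` has
    shape (HORIZON, N_SENSORS), one row of stds per predicted step."""
    return sum(gaussian_entropy(step) for step in pred_sigmas)

# Toy usage: rank two candidate rollouts by predictive uncertainty.
# An exploration policy would prefer the higher-entropy one.
rng = np.random.default_rng(0)
rollout_a = rng.uniform(0.1, 0.5, size=(HORIZON, N_SENSORS))
rollout_b = rng.uniform(0.1, 1.5, size=(HORIZON, N_SENSORS))
print(entropy_reward(rollout_a) < entropy_reward(rollout_b))  # True
```

Since the Software Dependencies row mentions the distributed RL framework of Horgan et al. (2018), the entropy signal presumably serves as a reward for training an exploration policy, rather than for ranking fixed rollouts as in this toy example.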
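
The Open Datasets row notes that the MPL hand model ships with MuJoCo HAPTIX. As a hedged sketch of how one might load and inspect such a model today, the snippet below uses the modern open-source `mujoco` Python bindings (not the HAPTIX-era tooling the paper used); the XML path is a hypothetical placeholder for wherever the downloaded model lives.

```python
import mujoco  # open-source MuJoCo Python bindings (pip install mujoco)

# Hypothetical path: substitute the actual XML file from the MPL hand
# download at http://www.mujoco.org/book/haptix.html.
model = mujoco.MjModel.from_xml_path("MPL/MPL_Basic_scene.xml")
data = mujoco.MjData(model)

mujoco.mj_step(model, data)  # advance the simulation by one timestep
print(model.nsensor, "sensors; readings shape:", data.sensordata.shape)
```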