reproducibilityindex.ai

Hierarchical Visuomotor Control of Humanoids

Authors: Josh Merel, Arun Ahuja, Vu Pham, Saran Tunyasuvunakool, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Greg Wayne

ICLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	3 EXPERIMENTS We compared the various approaches on a variety of tasks implemented in MuJoCo (Todorov et al., 2012). The core tasks we considered for the main comparisons were Go-to-target, wall navigation (Walls), running on gapped platforms (Gaps), foraging for colored ball rewards (Forage), and a foraging task requiring the agent to remember the reward value of the different colored balls (Heterogeneous Forage) (see Fig. 5).
Researcher Affiliation	Industry	DeepMind, London, UK {jsmerel,arahuja,vuph,stunya,liusiqi,dhruvat,heess,gregwayne}@google.com
Pseudocode	No	The paper does not contain explicit pseudocode or algorithm blocks.
Open Source Code	No	The paper does not explicitly state that source code for the described methodology is publicly available, nor does it provide a direct link to a code repository.
Open Datasets	Yes	The data used in this project was obtained from mocap.cs.cmu.edu. The database was created with funding from NSF EIA-019621.
Dataset Splits	No	The paper describes training using a replay buffer and distributed actors, but does not specify explicit training, validation, and test dataset splits with percentages or sample counts.
Hardware Specification	Yes	The learner ran on a single Pascal 100 or Volta 100 GPU.
Software Dependencies	No	The paper mentions software components like 'Adam' and 'MuJoCo' but does not provide specific version numbers for these or other key software dependencies.
Experiment Setup	Yes	Table 1: Parameters for training the agent on different environments/tasks. Columns: unroll, LSTM state size, value MLP, gamma, replay size. Go To Target: 10, 128, (128, 1), 0.99, 10^6; Walls / Gaps: 20, 128, (128, 1), 0.99, 10^4; Forage: 50, 256, (200, 200, 1), 0.995, 10^4; Heterogeneous Forage: 200, 256, (200, 200, 1), 0.99, 10^5.
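
For anyone attempting a reproduction, the Table 1 values can be held in a small lookup structure. The following is a minimal sketch, not the authors' configuration format: the class name `TaskParams`, the dictionary `TASK_PARAMS`, and the task keys are hypothetical, and the replay sizes are read as powers of ten (10^6, 10^4, 10^5) on the assumption that superscripts were lost when the table was extracted.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TaskParams:
    """Per-task training parameters as listed in Table 1 (names are illustrative)."""
    unroll: int                 # unroll length
    lstm_state_size: int        # LSTM state size
    value_mlp: Tuple[int, ...]  # layer sizes of the value MLP
    gamma: float                # discount factor
    replay_size: int            # replay size (assumed to be a power of ten)


# "Walls / Gaps" share one row in the table; they are split into two keys here.
_WALLS_GAPS = TaskParams(20, 128, (128, 1), 0.99, 10**4)

TASK_PARAMS: Dict[str, TaskParams] = {
    "go_to_target": TaskParams(10, 128, (128, 1), 0.99, 10**6),
    "walls": _WALLS_GAPS,
    "gaps": _WALLS_GAPS,
    "forage": TaskParams(50, 256, (200, 200, 1), 0.995, 10**4),
    "heterogeneous_forage": TaskParams(200, 256, (200, 200, 1), 0.99, 10**5),
}

if __name__ == "__main__":
    for task, params in TASK_PARAMS.items():
        print(task, params)
```

Freezing the dataclass keeps the recorded values from being changed by accident during an experiment run. Settings the table does not list (learning rate, batch size, software versions) are deliberately left out rather than guessed.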