Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Hierarchical Visuomotor Control of Humanoids
Authors: Josh Merel, Arun Ahuja, Vu Pham, Saran Tunyasuvunakool, Siqi Liu, Dhruva Tirumala, Nicolas Heess, Greg Wayne
ICLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 3 EXPERIMENTS We compared the various approaches on a variety of tasks implemented in MuJoCo (Todorov et al., 2012). The core tasks we considered for the main comparisons were Go-to-target, wall navigation (Walls), running on gapped platforms (Gaps), foraging for colored ball rewards (Forage), and a foraging task requiring the agent to remember the reward value of the different colored balls (Heterogeneous Forage) (see Fig. 5). |
| Researcher Affiliation | Industry | DeepMind, London, UK EMAIL |
| Pseudocode | No | The paper does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is publicly available, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | The data used in this project was obtained from mocap.cs.cmu.edu. The database was created with funding from NSF EIA-019621. |
| Dataset Splits | No | The paper describes training using a replay buffer and distributed actors, but does not specify explicit training, validation, and test dataset splits with percentages or sample counts. |
| Hardware Specification | Yes | The learner ran on a single Pascal 100 or Volta 100 GPU. |
| Software Dependencies | No | The paper mentions software components like 'Adam' and 'MuJoCo' but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | Table 1: Parameters for training the agent on different environments/tasks. Columns: task, unroll, LSTM state size, value MLP, gamma, replay size. Go To Target: 10, 128, (128, 1), 0.99, 10^6; Walls / Gaps: 20, 128, (128, 1), 0.99, 10^4; Forage: 50, 256, (200, 200, 1), 0.995, 10^4; Heterogeneous Forage: 200, 256, (200, 200, 1), 0.99, 10^5. |
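
For anyone attempting a reproduction, the Table 1 parameters quoted in the Experiment Setup row could be recorded as a small configuration object. The following is a minimal sketch, not the authors' code: the class name `TaskParams`, the field names, and the dictionary keys are hypothetical, and the replay sizes are read as powers of ten (10^6, 10^4, 10^5) on the assumption that the superscripts were lost when the table was extracted.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TaskParams:
    """Per-task training parameters as listed in Table 1 (field names are hypothetical)."""
    unroll: int                 # unroll length
    lstm_state_size: int        # LSTM state size
    value_mlp: Tuple[int, ...]  # layer sizes of the value MLP
    gamma: float                # discount factor
    replay_size: int            # replay size; ASSUMED to be a power of ten


# "Walls / Gaps" share one row in the table, so both keys map to the same values.
_WALLS_GAPS = TaskParams(20, 128, (128, 1), 0.99, 10**4)

TASK_PARAMS: Dict[str, TaskParams] = {
    "go_to_target": TaskParams(10, 128, (128, 1), 0.99, 10**6),
    "walls": _WALLS_GAPS,
    "gaps": _WALLS_GAPS,
    "forage": TaskParams(50, 256, (200, 200, 1), 0.995, 10**4),
    "heterogeneous_forage": TaskParams(200, 256, (200, 200, 1), 0.99, 10**5),
}

if __name__ == "__main__":
    # Print one line per task for a quick visual check against the table.
    for name, params in TASK_PARAMS.items():
        print(name, params)
```

Freezing the dataclass keeps the recorded values immutable, so a reproduction script cannot silently alter them during a run.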