H-GAP: Humanoid Control with a Generalist Planner

Authors: Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For a humanoid with 56 degrees of freedom, we empirically demonstrate that H-GAP learns to represent and generate a wide range of motor behaviours. Further, without any learning from online interactions, it can also flexibly transfer these behaviors to solve novel downstream control tasks via planning. Notably, H-GAP outperforms established MPC baselines that have access to the ground-truth dynamics model, and is superior or comparable to offline RL methods trained for individual tasks. Finally, we conduct a series of empirical studies on the scaling properties of H-GAP, showing the potential for performance gains via additional data but not additional compute.
Researcher Affiliation | Collaboration | Zhengyao Jiang*1,5, Yingchen Xu*1,2, Nolan Wagener3, Yicheng Luo1, Michael Janner4, Edward Grefenstette1, Tim Rocktäschel1, Yuandong Tian2 (1: University College London; 2: AI at Meta; 3: Georgia Institute of Technology; 4: University of California, Berkeley; 5: Weco AI)
Pseudocode | Yes | Algorithm 1: H-GAP Model Predictive Control (a hedged sketch of this planning loop appears after the table)
Open Source Code | Yes | Code and videos are available at https://ycxuyingchen.github.io/hgap/.
Open Datasets | Yes | Data. We use the MoCapAct dataset (Wagener et al., 2022), which contains over 500k rollouts with a total of 67M environment transitions (corresponding to 620 hours in the simulator) from a collection of expert MoCap tracking policies for a MuJoCo-based simulated humanoid; these policies faithfully track 3.5 hours of varied recorded motion from the CMU MoCap dataset (CMU, 2003). (A hedged data-loading sketch appears after the table.)
Dataset Splits | No | The paper mentions a 'validation loss' and 'model validation set accuracy' (e.g., in the Figure 3 caption) but does not specify the dataset splits (e.g., percentages or sample counts) used for validation.
Hardware Specification | Yes | We train five different sizes of the H-GAP model, ranging from 6M to 300M parameters, utilizing 8 Nvidia V100 GPUs for each model.
Software Dependencies | No | The paper does not provide specific version numbers for the software dependencies or libraries used in the experiments.
Experiment Setup | Yes | We leave more low-level details and hyperparameters in Appendix B. [...] Hyperparameters: A comprehensive list of hyperparameter settings used in our experiments can be found in Table 5. [...] Detailed hyperparameters for all the baselines used in our experiments can be found in Table 4.
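
For readers orienting themselves on Algorithm 1 (H-GAP Model Predictive Control), below is a minimal Python sketch of the receding-horizon loop it describes. The callables `prior`, `decoder`, and `reward_fn` are hypothetical stand-ins for H-GAP's Prior Transformer, VQ-VAE decoder, and the downstream task reward; the greedy argmax over sampled candidates is an illustrative simplification, not the paper's exact selection step.

```python
import numpy as np

def hgap_mpc_step(state, prior, decoder, reward_fn,
                  num_samples=64, horizon=16):
    """One receding-horizon planning step in the spirit of Algorithm 1.

    `prior`, `decoder`, and `reward_fn` are hypothetical callables standing
    in for H-GAP's Prior Transformer, VQ-VAE decoder, and the task reward.
    """
    best_return, best_action = -np.inf, None
    for _ in range(num_samples):
        # Sample a sequence of discrete latent codes from the prior,
        # conditioned on the current state.
        codes = prior.sample(state, horizon)
        # Decode the codes into a candidate state-action trajectory.
        states, actions = decoder(state, codes)
        # Score the candidate under the downstream task's reward.
        ret = sum(reward_fn(s, a) for s, a in zip(states, actions))
        if ret > best_return:
            best_return, best_action = ret, actions[0]
    # Execute only the first action of the best plan, then replan
    # from the next observed state (model predictive control).
    return best_action
```

Because only the first action of each plan is executed, planning errors late in the horizon matter less; this is the standard MPC design choice the paper relies on for transferring MoCap behaviors to novel tasks without online learning.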
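Similarly, since MoCapAct distributes its rollouts as HDF5 files, iterating over the data might look roughly like the sketch below. The file name and dataset keys are assumptions for illustration, not the actual MoCapAct schema; consult the dataset release for the real layout.

```python
import h5py

# A hedged sketch of walking over MoCapAct-style rollouts stored in HDF5.
# "mocapact_rollouts.hdf5", "observations", and "actions" are placeholder
# names, not the real MoCapAct schema.
with h5py.File("mocapact_rollouts.hdf5", "r") as f:
    for episode in f.keys():
        obs = f[episode]["observations"][:]  # per-step humanoid observations
        act = f[episode]["actions"][:]       # expert tracking-policy actions
        print(episode, obs.shape, act.shape)
```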