H-GAP: Humanoid Control with a Generalist Planner
Authors: Zhengyao Jiang, Yingchen Xu, Nolan Wagener, Yicheng Luo, Michael Janner, Edward Grefenstette, Tim Rocktäschel, Yuandong Tian
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For a humanoid with 56 degrees of freedom, we empirically demonstrate that H-GAP learns to represent and generate a wide range of motor behaviours. Further, without any learning from online interactions, it can flexibly transfer these behaviours to solve novel downstream control tasks via planning. Notably, H-GAP outperforms established MPC baselines that have access to the ground-truth dynamics model, and is superior or comparable to offline RL methods trained for individual tasks. Finally, we conduct a series of empirical studies on the scaling properties of H-GAP, showing the potential for performance gains via additional data but not additional compute. |
| Researcher Affiliation | Collaboration | Zhengyao Jiang*,1,5, Yingchen Xu*,1,2, Nolan Wagener3, Yicheng Luo1, Michael Janner4, Edward Grefenstette1, Tim Rocktäschel1, Yuandong Tian2 (1 University College London, 2 AI at Meta, 3 Georgia Institute of Technology, 4 University of California at Berkeley, 5 Weco AI) |
| Pseudocode | Yes | Algorithm 1: H-GAP Model Predictive Control (a hedged sketch of this planning loop follows the table). |
| Open Source Code | Yes | Code and videos are available at https://ycxuyingchen.github.io/hgap/. |
| Open Datasets | Yes | Data. We use the MoCapAct dataset (Wagener et al., 2022), which contains over 500k rollouts with a total of 67M environment transitions (corresponding to 620 hours in the simulator) from a collection of expert MoCap tracking policies for a MuJoCo-based simulated humanoid that can faithfully track 3.5 hours of various recorded motion from the CMU MoCap dataset (CMU, 2003). |
| Dataset Splits | No | The paper mentions 'validation loss' and 'model validation set accuracy' (e.g., in the caption of Figure 3) but does not provide specific details on the dataset splits (e.g., percentages or sample counts) used for validation. |
| Hardware Specification | Yes | We train five different sizes of the H-GAP model, ranging from 6M to 300M parameters, utilizing 8 Nvidia V100 GPUs for each model. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | We leave more low-level details and hyperparameters in Appendix B. [...] Hyperparameter A comprehensive list of hyperparameter settings used in our experiments can be found in Table 5. [...] Detailed hyperparameters for all the baselines used in our experiments can be found in Table 4. |
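
For orientation, the pseudocode flagged above (Algorithm 1: H-GAP Model Predictive Control) follows the familiar sample-and-score MPC pattern: sample candidate action sequences from the learned generative prior, score them under the downstream task objective, execute the first action of the best candidate, and re-plan. Below is a minimal sketch of that loop. All names (`sample_from_prior`, `score_plans`, `mpc_episode`, the `env` interface) are illustrative assumptions, not the authors' API; the released code at https://ycxuyingchen.github.io/hgap/ is authoritative.

```python
# Minimal sketch of a sample-and-score MPC loop in the spirit of
# "Algorithm 1: H-GAP Model Predictive Control". Every name here is an
# illustrative placeholder, not the authors' API. H-GAP additionally
# tokenizes trajectories with a VQ-VAE and samples from a Transformer
# prior over discrete latent codes, which this sketch abstracts away.
import numpy as np

def mpc_episode(sample_from_prior, score_plans, env,
                num_samples=64, horizon=16, max_steps=1000):
    """Receding-horizon control with a learned generative prior.

    sample_from_prior(history, n, h) -> (n, h, act_dim) candidate action
        sequences drawn from the sequence model, conditioned on history.
    score_plans(history, candidates) -> (n,) estimated task objective
        (e.g., predicted return) for each candidate plan.
    """
    obs = env.reset()
    history, total_reward = [obs], 0.0
    for _ in range(max_steps):
        # 1) Sample candidate plans from the generalist prior.
        candidates = sample_from_prior(history, num_samples, horizon)
        # 2) Score each plan under the downstream task objective.
        scores = score_plans(history, candidates)
        # 3) Execute only the first action of the best plan, then re-plan.
        best_plan = candidates[int(np.argmax(scores))]
        obs, reward, done, _ = env.step(best_plan[0])
        total_reward += reward
        history.append(obs)
        if done:
            break
    return total_reward
```

Because only the first action of each plan is executed before re-planning, the prior can correct for model error at every step; this is the standard receding-horizon design choice that the paper's Algorithm 1 also follows.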