Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response
Authors: Junfeng Long, Zirui Wang, Quanyi Li, Liu Cao, Jiawei Gao, Jiangmiao Pang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train our method in Isaac Gym (Makoviychuk et al., 2021) and deploy it on Unitree Aliengo, A1, and Go1 robots. We evaluate and ablate its performance in both simulation and real-world regimes with carefully designed benchmarks and metrics. Experiments show that HIM can use minimal sensors, i.e., joint encoders and IMU, to drive a robot to traverse across any terrain under any disturbances. A wealth of real-world experiments demonstrates its agility, even in high-difficulty tasks and cases never occurred during the training process, revealing remarkable open-world generalizability. |
| Researcher Affiliation | Collaboration | Junfeng Long¹, Zirui Wang¹,², Quanyi Li¹, Jiawei Gao¹,³, Liu Cao¹,³, Jiangmiao Pang¹; ¹Open Robot Lab, Shanghai AI Laboratory; ²Zhejiang University; ³Tsinghua University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions 'Project page at this URL.' and 'The demo videos of our method can be found at the Project Page.' but does not explicitly state that the source code for the methodology is provided or include a direct link to a code repository. |
| Open Datasets | No | The paper primarily describes data generation within the Isaac Gym simulation environment for training ('numerous and diverse simulated data', 'training process needs 1000 rollouts') rather than using a pre-existing publicly available dataset with concrete access information. |
| Dataset Splits | No | The paper describes a training process within a simulation environment ('Isaac Gym with 4096 parallel environments and a rollout length of 100 time steps') and a training curriculum, but it does not specify explicit train/validation/test dataset splits as would be typical for a fixed dataset. |
| Hardware Specification | Yes | It only requires 1 hour of training on an RTX 4090 to enable a quadruped robot to traverse any terrain under any disturbances. |
| Software Dependencies | No | The paper mentions 'Isaac Gym' as the simulation environment but does not provide specific version numbers for it or any other software dependencies. |
| Experiment Setup | Yes | Simulation Setup. We use Isaac Gym (Rudin et al., 2022) with 4096 parallel environments and a rollout length of 100 time steps. The training process needs 1000 rollouts which takes 1 hour of wall clock time on NVIDIA RTX 4090. But its performance continues improving until 2000 rollouts. Table 7: Hyper Parameters for Training. Batch size: 4096 × 200; Mini-batch size: 4096 × 50; Number of epochs: 5; Clip range: 0.2; Entropy coefficient: 0.01; Discount factor: 0.99; GAE discount factor: 0.95; Desired KL-divergence: 0.01; Learning rate: 1 × 10⁻³; Adam epsilon: 1 × 10⁻⁸; Gradient clipping: 10; Number of prototypes: 16; Contrastive loss scale: 1.0; Velocity estimation loss scale: 1.0; Learning rate: 1 × 10⁻³; Adam epsilon: 1 × 10⁻⁸; Gradient clipping: 10 |
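
The values in the Experiment Setup row can be gathered into a single configuration object, which makes the implied sample counts easy to check. The following is a minimal sketch, not the authors' code: the class name `TrainConfig` and all field names are hypothetical. It also assumes that the Table 7 entries "4096 200" and "4096 50" are products (environments × steps) whose multiplication sign was lost in extraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""

    # Simulation setup as stated in the row's prose.
    num_envs: int = 4096
    rollout_length: int = 100
    num_rollouts: int = 1000

    # Assumption: "4096 200" and "4096 50" are read as products.
    batch_size: int = 4096 * 200
    mini_batch_size: int = 4096 * 50

    # Remaining Table 7 entries, in the order they are listed.
    num_epochs: int = 5
    clip_range: float = 0.2
    entropy_coef: float = 0.01
    discount_factor: float = 0.99
    gae_discount_factor: float = 0.95
    desired_kl: float = 0.01
    learning_rate: float = 1e-3
    adam_eps: float = 1e-8
    grad_clip: float = 10.0
    num_prototypes: int = 16
    contrastive_loss_scale: float = 1.0
    velocity_loss_scale: float = 1.0

    def transitions_per_rollout(self) -> int:
        # One rollout collects rollout_length steps from every parallel environment.
        return self.num_envs * self.rollout_length

    def total_transitions(self) -> int:
        # Transitions gathered over the full training run.
        return self.transitions_per_rollout() * self.num_rollouts

    def mini_batches_per_epoch(self) -> int:
        # How many mini-batches one pass over a batch yields.
        return self.batch_size // self.mini_batch_size


cfg = TrainConfig()
print(cfg.transitions_per_rollout())  # 409600
print(cfg.total_transitions())        # 409600000
print(cfg.mini_batches_per_epoch())   # 4
```

Under these assumptions, the stated 1000 rollouts correspond to roughly 410 million simulated transitions. The sketch does not try to reconcile the rollout length of 100 steps with the batch size of 4096 × 200, because the row itself does not explain how the two relate.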