Robust Imitation Learning against Variations in Environment Dynamics
Authors: Jongseong Chae, Seungyul Han, Whiyoung Jung, Myungsik Cho, Sungho Choi, Youngchul Sung
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical results show that our algorithm significantly improves robustness against dynamics perturbations compared to conventional IL baselines. |
| Researcher Affiliation | Academia | 1School of Electrical Engineering, KAIST, Daejeon, South Korea. 2Artificial Intelligence Graduate School, UNIST, Ulsan, South Korea. |
| Pseudocode | Yes | Algorithm 1 Robust Imitation learning with Multiple perturbed Environments (RIME) |
| Open Source Code | Yes | The source code of the proposed algorithm is available at https://github.com/JongseongChae/RIME. |
| Open Datasets | Yes | We experimented the considered algorithms on MuJoCo tasks: Hopper, Walker2d, HalfCheetah and Ant (Todorov et al., 2012). |
| Dataset Splits | Yes | All expert demonstrations are split into a 70% training dataset and a 30% validation dataset. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory) were provided for the experimental setup. |
| Software Dependencies | No | No specific version numbers for software dependencies (e.g., PyTorch version, MuJoCo version used for experiments) were provided. |
| Experiment Setup | Yes | The batch size is set to 2048, the number of update epochs for the policy at one iteration is set to 4, and the number of update epochs for the discriminator at one iteration is set to 5. Finally, the coefficient of the GP term is set to 10, and the coefficient of entropy for PPO is 0. |
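The reported setup details (batch size, per-iteration update epochs, GP and entropy coefficients, 70/30 demonstration split) can be sketched as a configuration; key and function names below are illustrative, not taken from the paper's released code:

```python
import random

# Hyperparameters quoted from the paper; dictionary keys are illustrative.
RIME_CONFIG = {
    "batch_size": 2048,
    "policy_update_epochs": 4,         # policy updates per iteration (PPO)
    "discriminator_update_epochs": 5,  # discriminator updates per iteration
    "gradient_penalty_coef": 10.0,     # coefficient of the GP term
    "entropy_coef": 0.0,               # PPO entropy coefficient
}

def split_demonstrations(demos, train_frac=0.7, seed=0):
    """Shuffle expert demonstrations and split them 70/30 into train/validation."""
    demos = list(demos)
    random.Random(seed).shuffle(demos)
    cut = int(len(demos) * train_frac)
    return demos[:cut], demos[cut:]

# Example: split 100 placeholder demonstrations.
train_set, val_set = split_demonstrations(range(100))
```

This is only a sketch of the reported settings under assumed naming; the authors' actual configuration lives in the linked repository.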