Robust Imitation Learning against Variations in Environment Dynamics

Authors: Jongseong Chae, Seungyul Han, Whiyoung Jung, Myungsik Cho, Sungho Choi, Youngchul Sung

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical results show that our algorithm significantly improves robustness against dynamics perturbations compared to conventional IL baselines.
Researcher Affiliation | Academia | (1) School of Electrical Engineering, KAIST, Daejeon, South Korea. (2) Artificial Intelligence Graduate School, UNIST, Ulsan, South Korea.
Pseudocode | Yes | Algorithm 1 Robust Imitation learning with Multiple perturbed Environments (RIME)
Open Source Code | Yes | The source code of the proposed algorithm is available at https://github.com/JongseongChae/RIME.
Open Datasets | Yes | We experimented the considered algorithms on MuJoCo tasks: Hopper, Walker2d, HalfCheetah and Ant (Todorov et al., 2012).
Dataset Splits | Yes | All expert demonstrations are split out 70% training dataset and 30% validation dataset.
Hardware Specification | No | No specific hardware details (like GPU/CPU models, memory) were provided for the experimental setup.
Software Dependencies | No | No specific version numbers for software dependencies (e.g., PyTorch version, MuJoCo version used for experiments) were provided.
Experiment Setup | Yes | The batch size is set to 2048, the number of update epochs for the policy at one iteration is set to 4, and the number of update epochs for the discriminator at one iteration is set to 5. Finally, the coefficient of the GP term is set to 10, and the coefficient of entropy for PPO is 0.
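
For readers who want to reuse the reported settings, the sketch below collects the values quoted under Dataset Splits and Experiment Setup in one place and shows a 70%/30% split of expert demonstrations. It is an illustration only: the names `ReportedSetup` and `split_demonstrations` are hypothetical and are not taken from the RIME repository, and the shuffling, the seed, and the integer placeholders standing in for demonstrations are assumptions, since the summary does not say how the split is drawn.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters as quoted in the summary (field names are hypothetical)."""
    batch_size: int = 2048
    policy_epochs_per_iter: int = 4          # policy update epochs per iteration
    discriminator_epochs_per_iter: int = 5   # discriminator update epochs per iteration
    gp_coef: float = 10.0                    # coefficient of the GP term
    ppo_entropy_coef: float = 0.0            # entropy coefficient for PPO
    train_fraction: float = 0.7              # 70% training, 30% validation


def split_demonstrations(demos, train_fraction=0.7, seed=0):
    """Return (train, validation) lists.

    Assumption: demonstrations are shuffled with a fixed seed before the
    split; the summary only reports the 70/30 proportions.
    """
    items = list(demos)
    random.Random(seed).shuffle(items)
    n_train = int(round(train_fraction * len(items)))
    return items[:n_train], items[n_train:]


setup = ReportedSetup()
demos = list(range(50))  # placeholder stand-ins for expert demonstrations
train, val = split_demonstrations(demos, setup.train_fraction)
print(len(train), len(val))
```

Keeping the settings in a frozen dataclass makes accidental changes during a run fail loudly, which helps when comparing against the reported numbers.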