Revisiting Domain Randomization via Relaxed State-Adversarial Policy Optimization

Authors: Yun-Hsuan Lien, Ping-Chun Hsieh, Yu-Shuen Wang

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method by comparing it to state-of-the-art methods, providing experimental results and theoretical proofs to verify its effectiveness. Our source code and appendix are available at https://github.com/sophialien/RAPPO. We performed two experiments on the MuJoCo platform (Todorov et al., 2012) to assess the performance of our relaxed state-adversarial policy optimization (RAPPO) against various adversaries.
Researcher Affiliation | Academia | Yun-Hsuan Lien, Ping-Chun Hsieh, Yu-Shuen Wang (National Yang Ming Chiao Tung University, Hsinchu, Taiwan). Correspondence to: Yun-Hsuan Lien <sophia.yh.lien@gmail.com>.
Pseudocode | Yes | Algorithm 1 outlines the steps of our approach.
Open Source Code | Yes | Our source code and appendix are available at https://github.com/sophialien/RAPPO.
Open Datasets | Yes | We performed two experiments on the MuJoCo platform (Todorov et al., 2012) to assess the performance of our relaxed state-adversarial policy optimization (RAPPO) against various adversaries.
Dataset Splits | No | The paper mentions
Hardware Specification | No | No specific hardware (GPU model, CPU model, memory) is mentioned for running the experiments.
Software Dependencies | No | The paper mentions
Experiment Setup | Yes | We set ε to 0.015, 0.002, 0.002, 0.03, and 0.005 for the HalfCheetah-v2, Hopper-v2, Ant-v2, Walker2d-v2, and Humanoid-v2 environments, respectively. These values were chosen based on the mean magnitude of actions taken in each environment. The baselines and our method were implemented using the PPO algorithm (Schulman et al., 2017), and the default parameters were used.
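
For concreteness, below is a minimal sketch of how the per-environment perturbation budgets quoted in the Experiment Setup row could be wired into an experiment script. The dictionary name, the `perturb_state` helper, and the clipping scheme are illustrative assumptions and are not taken from the authors' RAPPO code; only the budget values and environment names come from the quoted setup.

import numpy as np

# Perturbation budget (epsilon) per MuJoCo environment, as quoted in the paper's setup.
STATE_PERTURBATION_BUDGET = {
    "HalfCheetah-v2": 0.015,
    "Hopper-v2": 0.002,
    "Ant-v2": 0.002,
    "Walker2d-v2": 0.03,
    "Humanoid-v2": 0.005,
}

def perturb_state(state: np.ndarray, delta: np.ndarray, env_id: str) -> np.ndarray:
    """Add a state perturbation, clipped elementwise to the environment's budget.

    This is a hypothetical helper for illustration; the actual adversarial
    perturbation in RAPPO is computed by the method described in the paper.
    """
    eps = STATE_PERTURBATION_BUDGET[env_id]
    return state + np.clip(delta, -eps, eps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.standard_normal(17)        # HalfCheetah-v2 observations are 17-dimensional
    d = rng.standard_normal(17) * 0.1  # raw (unclipped) perturbation direction
    print(perturb_state(s, d, "HalfCheetah-v2"))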