Revisiting Domain Randomization via Relaxed State-Adversarial Policy Optimization
Authors: Yun-Hsuan Lien, Ping-Chun Hsieh, Yu-Shuen Wang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method by comparing it to state-of-the-art methods, providing experimental results and theoretical proofs to verify its effectiveness. Our source code and appendix are available at https://github.com/sophialien/RAPPO. We performed two experiments on the MuJoCo platform (Todorov et al., 2012) to assess the performance of our relaxed state-adversarial policy optimization (RAPPO) against various adversaries. |
| Researcher Affiliation | Academia | Yun-Hsuan Lien, Ping-Chun Hsieh, Yu-Shuen Wang; National Yang Ming Chiao Tung University, Hsinchu, Taiwan. Correspondence to: Yun-Hsuan Lien <sophia.yh.lien@gmail.com>. |
| Pseudocode | Yes | Algorithm 1 outlines the steps of our approach. |
| Open Source Code | Yes | Our source code and appendix are available at https://github.com/sophialien/RAPPO. |
| Open Datasets | Yes | We performed two experiments on the MuJoCo platform (Todorov et al., 2012) to assess the performance of our relaxed state-adversarial policy optimization (RAPPO) against various adversaries. |
| Dataset Splits | No | The paper does not describe train/validation/test dataset splits; the experiments use MuJoCo benchmark environments rather than fixed datasets. |
| Hardware Specification | No | No specific hardware (GPU model, CPU model, memory) is mentioned for running the experiments. |
| Software Dependencies | No | The paper does not list software dependencies or version information. |
| Experiment Setup | Yes | We set [the perturbation budget] to 0.015, 0.002, 0.002, 0.03, and 0.005 for the HalfCheetah-v2, Hopper-v2, Ant-v2, Walker2d-v2, and Humanoid-v2 environments, respectively. These values were chosen based on the mean magnitude of actions taken in each environment. The baselines and our method were implemented using the PPO algorithm (Schulman et al., 2017), and the default parameters were used. |
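
The experiment-setup row lends itself to a small configuration sketch. The snippet below is a minimal illustration, not the authors' implementation (that lives at https://github.com/sophialien/RAPPO): it assumes the legacy `gym` + `mujoco-py` stack that provides the `-v2` MuJoCo environment IDs and `stable-baselines3` for "PPO with the default parameters", and a hypothetical uniform observation-perturbation wrapper stands in for RAPPO's relaxed state-adversarial objective. Only the environment names and per-environment budgets are taken from the paper.

```python
# Minimal sketch of the experiment setup quoted above -- NOT the official RAPPO code
# (see https://github.com/sophialien/RAPPO). Assumptions: legacy gym + mujoco-py
# (for the -v2 environment IDs) and stable-baselines3 for "PPO with default parameters".
import gym
import numpy as np
from stable_baselines3 import PPO

# Per-environment perturbation budgets quoted in the Experiment Setup row.
EPSILONS = {
    "HalfCheetah-v2": 0.015,
    "Hopper-v2": 0.002,
    "Ant-v2": 0.002,
    "Walker2d-v2": 0.03,
    "Humanoid-v2": 0.005,
}


class UniformStatePerturbation(gym.ObservationWrapper):
    """Hypothetical stand-in for a state adversary: adds bounded uniform noise.

    RAPPO optimizes against a relaxed worst-case state adversary; random noise
    is used here only to show where the budget `eps` enters the pipeline.
    """

    def __init__(self, env, eps):
        super().__init__(env)
        self.eps = eps

    def observation(self, observation):
        noise = np.random.uniform(-self.eps, self.eps, size=observation.shape)
        return (observation + noise).astype(observation.dtype)


def train(env_id: str = "Hopper-v2", total_timesteps: int = 1_000_000) -> PPO:
    env = UniformStatePerturbation(gym.make(env_id), EPSILONS[env_id])
    model = PPO("MlpPolicy", env, verbose=1)  # default PPO hyperparameters
    model.learn(total_timesteps=total_timesteps)
    return model


if __name__ == "__main__":
    train()
```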