Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula
Authors: Aryaman Reddi, Maximilian Tölle, Jan Peters, Georgia Chalvatzaki, Carlo D'Eramo
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive evidence of QARL outperforming RARL and recent baselines across several MuJoCo locomotion and navigation problems in overall performance and robustness. ... We demonstrate that our approach facilitates the learning of the protagonist, and we show that QARL outperforms RARL and related baselines in terms of performance and robustness across several high-dimensional MuJoCo problems, such as navigation and locomotion, of the DeepMind Control Suite (Todorov et al., 2012; Tunyasuvunakool et al., 2020). |
| Researcher Affiliation | Academia | 1 Department of Computer Science, TU Darmstadt, Germany; 2 Hessian Center for Artificial Intelligence (Hessian.ai), Germany; 3 German Research Center for AI (DFKI), Systems AI for Robot Learning, Germany; 4 Center for Cognitive Science, TU Darmstadt, Germany; 5 Center for Artificial Intelligence and Data Science, University of Würzburg, Germany |
| Pseudocode | Yes | Algorithm 1 Quantal Adversarial Reinforcement Learning (QARL)... Algorithm 2 Force-based curriculum adversarial reinforcement learning (Force-Curriculum) |
| Open Source Code | Yes | Code available at https://github.com/AryamanReddi99/quantal-adversarial-rl |
| Open Datasets | Yes | We consider a broad set of MuJoCo control problems (Todorov et al., 2012) from the DeepMind Control Suite (Tunyasuvunakool et al., 2020) |
| Dataset Splits | No | The paper describes environments and experimental settings but does not provide specific numerical train/validation/test splits for any dataset. It mentions "training steps" and "test time" but not how the overall dataset is split. |
| Hardware Specification | Yes | The experiments are carried out on a computational cluster with 64GB of RAM and an AMD Ryzen 9 16-Core processor. |
| Software Dependencies | No | The algorithms investigated are implemented using the MushroomRL (D'Eramo et al., 2021) library, which is also used for the implementation of all agents and adversarial environment wrappers. The specific version number of MushroomRL is not provided. |
| Experiment Setup | Yes | The hyperparameters of the adversarial agents across all environments are shown in Table 2. ... These environment-specific parameters are shown in Table 3... The algorithm parameters for Curriculum Adversarial Training (CAT) in Table 4 denote the iterations between which the gradient of the linear force budget curriculum would be constant and non-zero... |
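
The pseudocode row above references Algorithm 1 (QARL), whose core mechanism is a curriculum over the adversary's bounded rationality: the adversary follows a quantal response (Boltzmann) model whose temperature is annealed from high (a near-random, boundedly rational adversary) to low (approaching the fully rational worst-case adversary of RARL). The sketch below illustrates that mechanism on a discrete action set; the geometric temperature schedule, function names, and discrete-action setting are illustrative assumptions, not the paper's continuous-control implementation.

```python
import numpy as np

def adversary_temperature(step, total_steps, t_start=10.0, t_end=0.01):
    """Geometrically anneal the adversary's Boltzmann temperature.

    High temperature -> near-uniform (boundedly rational) adversary;
    low temperature -> near-optimal worst-case adversary. The geometric
    schedule is an illustrative assumption, not the paper's exact one.
    """
    frac = step / max(total_steps - 1, 1)
    return t_start * (t_end / t_start) ** frac

def quantal_response(q_values, temperature):
    """Boltzmann (quantal response) distribution over adversary actions."""
    logits = np.asarray(q_values, dtype=float) / temperature
    logits -= logits.max()  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# The same adversarial Q-values induce very different behaviour at the
# two ends of the curriculum.
q = [1.0, 0.5, -0.2]
print(quantal_response(q, adversary_temperature(0, 100)))   # near-uniform
print(quantal_response(q, adversary_temperature(99, 100)))  # near-greedy
```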
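
The open-datasets row refers to MuJoCo control problems from the DeepMind Control Suite. A minimal way to instantiate such a task is through the dm_control package, sketched below; the walker-walk task and the random placeholder policy are illustrative choices, not the paper's exact setup, which wraps these environments through MushroomRL with adversarial perturbations.

```python
import numpy as np
from dm_control import suite

# Load one illustrative DeepMind Control Suite task; the paper evaluates
# a broader set of locomotion and navigation problems.
env = suite.load(domain_name='walker', task_name='walk')
action_spec = env.action_spec()

time_step = env.reset()
episode_return = 0.0
while not time_step.last():
    # Uniform random action as a placeholder for a trained protagonist
    # policy; QARL would additionally apply adversarial forces here.
    action = np.random.uniform(action_spec.minimum,
                               action_spec.maximum,
                               size=action_spec.shape)
    time_step = env.step(action)
    episode_return += time_step.reward
print(f'episode return: {episode_return:.2f}')
```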