Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula

Authors: Aryaman Reddi, Maximilian Tölle, Jan Peters, Georgia Chalvatzaki, Carlo D'Eramo

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide extensive evidence of QARL outperforming RARL and recent baselines across several MuJoCo locomotion and navigation problems in overall performance and robustness. ... We demonstrate that our approach facilitates the learning of the protagonist, and we show that QARL outperforms RARL and related baselines in terms of performance and robustness across several high-dimensional MuJoCo problems, such as navigation and locomotion, of the DeepMind Control Suite (Todorov et al., 2012; Tunyasuvunakool et al., 2020)."
Researcher Affiliation | Academia | "1 Department of Computer Science, TU Darmstadt, Germany; 2 Hessian Center for Artificial Intelligence (Hessian.ai), Germany; 3 German Research Center for AI (DFKI), Systems AI for Robot Learning, Germany; 4 Center for Cognitive Science, TU Darmstadt, Germany; 5 Center for Artificial Intelligence and Data Science, University of Würzburg, Germany"
Pseudocode | Yes | "Algorithm 1 Quantal Adversarial Reinforcement Learning (QARL)" ... "Algorithm 2 Force-based curriculum adversarial reinforcement learning (Force-Curriculum)" (a hedged sketch of the temperature mechanism in Algorithm 1 appears after this table)
Open Source Code | Yes | Code available at https://github.com/AryamanReddi99/quantal-adversarial-rl
Open Datasets | Yes | "We consider a broad set of MuJoCo control problems (Todorov et al., 2012) from the DeepMind Control Suite (Tunyasuvunakool et al., 2020)"
Dataset Splits | No | The paper describes environments and experimental settings but does not provide specific numerical train/validation/test splits for any dataset. It mentions "training steps" and "test time" but not how the overall dataset is split.
Hardware Specification | Yes | "The experiments are carried out on a computational cluster with 64GB of RAM and an AMD Ryzen 9 16-Core processor."
Software Dependencies | No | "The algorithms investigated are implemented using the MushroomRL (D'Eramo et al., 2021) library", which is also used for the implementation of all agents and adversarial environment wrappers. The specific version number for MushroomRL is not provided.
Experiment Setup | Yes | "The hyperparameters of the adversarial agents across all environments are shown in Table 2." ... "These environment-specific parameters are shown in Table 3" ... "The algorithm parameters for Curriculum Adversarial Training (CAT) in Table 4 denote the iterations between which the gradient of the linear force budget curriculum would be constant and non-zero..." (see the force-curriculum sketch after this table)
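
The Pseudocode row names the paper's two algorithm boxes but not their bodies. As a rough, non-authoritative illustration of the bounded-rationality mechanism that QARL's Algorithm 1 revolves around, the sketch below anneals an adversary temperature and samples adversary actions from the resulting Boltzmann (quantal-response) distribution; the outer training loop would alternate protagonist and adversary updates at the current temperature. The linear schedule, its endpoint values, and all function names here are assumptions for illustration, not the paper's actual pseudocode.

```python
import numpy as np

def temperature(iteration: int, n_iterations: int,
                t_high: float = 10.0, t_low: float = 0.01) -> float:
    # Anneal the adversary's temperature from t_high (a near-random,
    # bounded-rational opponent) to t_low (a near fully rational one).
    # The linear shape and endpoints are illustrative placeholders.
    frac = iteration / max(n_iterations - 1, 1)
    return t_high + frac * (t_low - t_high)

def quantal_response(q_values, temp: float, rng: np.random.Generator) -> int:
    # Boltzmann (quantal-response) action choice: a high temperature
    # flattens the distribution, a low one approaches the greedy argmax.
    logits = np.asarray(q_values, dtype=float) / max(temp, 1e-8)
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Minimal demo: the same adversary preferences yield a near-uniform
# opponent early in training and a near-greedy one late in training.
rng = np.random.default_rng(0)
q = [1.0, 0.5, -0.2]
early_action = quantal_response(q, temperature(0, 100), rng)
late_action = quantal_response(q, temperature(99, 100), rng)
```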
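The Experiment Setup row describes the CAT baseline's linear force-budget curriculum, whose gradient is constant and non-zero between two iterations given in the paper's Table 4. A minimal sketch of such a schedule follows; the parameter names and the numbers in the example are hypothetical, not values from the paper.

```python
def force_budget(iteration: int, ramp_start: int, ramp_end: int,
                 max_force: float) -> float:
    # Linear force-budget curriculum: zero budget before ramp_start,
    # a constant non-zero slope between ramp_start and ramp_end,
    # and the full adversary budget afterwards.
    if iteration <= ramp_start:
        return 0.0
    if iteration >= ramp_end:
        return max_force
    return max_force * (iteration - ramp_start) / (ramp_end - ramp_start)

# Hypothetical values: the budget ramps from 0 to 5.0 between
# iterations 100 and 400, so iteration 250 gets half the budget.
assert force_budget(250, ramp_start=100, ramp_end=400, max_force=5.0) == 2.5
```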