Resilient Multi-Agent Reinforcement Learning with Adversarial Value Decomposition
Authors: Thomy Phan, Lenz Belzner, Thomas Gabor, Andreas Sedlmeier, Fabian Ritz, Claudia Linnhoff-Popien
Pages: 11308-11316
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate RADAR in two cooperative multi-agent domains and show that RADAR achieves better worst case performance w.r.t. arbitrary agent changes than state-of-the-art MARL. An empirical evaluation of RADAR in two cooperative multi-agent domains and a comparison with state-of-the-art MARL w.r.t. the proposed test scheme. |
| Researcher Affiliation | Collaboration | Thomy Phan¹, Lenz Belzner², Thomas Gabor¹, Andreas Sedlmeier¹, Fabian Ritz¹, Claudia Linnhoff-Popien¹; ¹LMU Munich, ²MaibornWolff; thomy.phan@ifi.lmu.de |
| Pseudocode | Yes | Algorithm 1 Randomized Adversarial Training (RAT) |
| Open Source Code | Yes | Code available at https://github.com/thomyphan/resilient-marl |
| Open Datasets | No | We implemented a predator-prey (PP) and a cyber-physical production system (CPPS) domain with N agents. The paper describes custom environments/domains that were implemented for the experiments, but does not provide concrete access information (link, DOI, formal citation) for a publicly available dataset. |
| Dataset Splits | No | The paper discusses training runs and testing, but does not specify explicit train/validation/test dataset splits with percentages or sample counts for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'ADAM' as an optimizer but does not specify versions for any key software components or libraries. |
| Experiment Setup | Yes | The neural networks are updated every 1000 time steps using ADAM with a learning rate of 0.001. We set γ = 0.95, T = 4000, and Ne = 10 (Algorithm 1). |
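
The Experiment Setup row can be captured as a small configuration object. The following is a minimal sketch, not code from the authors' repository: the class name, field names, and both helper functions are hypothetical, γ is read as the discount factor (the usual RL convention), and T and Ne are carried over as the symbols from Algorithm 1 without interpreting them further.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    learning_rate: float = 0.001  # ADAM learning rate
    update_interval: int = 1000   # time steps between network updates
    gamma: float = 0.95           # γ, assumed here to be the discount factor
    T: int = 4000                 # T as used in Algorithm 1
    N_e: int = 10                 # Ne as used in Algorithm 1


def should_update(step: int, setup: ExperimentSetup) -> bool:
    """Return True on every `update_interval`-th time step (steps counted from 1)."""
    return step > 0 and step % setup.update_interval == 0


def discounted_return(rewards, gamma: float) -> float:
    """Compute sum_t gamma**t * r_t by accumulating from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


setup = ExperimentSetup()
two_step_return = discounted_return([1.0, 1.0], setup.gamma)
```

Collecting the values in one frozen dataclass keeps the reported settings in a single place, which makes it easier to compare a rerun against the numbers stated in the paper.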