Decentralized Anomaly Detection in Cooperative Multi-Agent Reinforcement Learning
Authors: Kiarash Kazari, Ezzeldin Shereen, György Dán
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive simulations on various multi-agent benchmarks show the effectiveness of the proposed detection scheme in detecting state-of-the-art attacks and in limiting the impact of undetectable attacks. In this section, we evaluate our detection scheme against state-of-the-art adversarial attacks, as well as the dynamic attack proposed in Section 5. We use three test environments for evaluating the proposed detector: StarCraft II Multi-Agent Challenge (SMAC) [Samvelyan et al., 2019], Multi Particle Environment (MPE) [Mordatch and Abbeel, 2017], and Level-Based Foraging (LBF) [Papoudakis et al., 2021]. |
| Researcher Affiliation | Academia | Kiarash Kazari, Ezzeldin Shereen, György Dán. Division of Network and Systems Engineering, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden. {kkazari, eshereen, gyuri}@kth.se |
| Pseudocode | No | The paper describes algorithms and procedures in text, but it does not provide any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at https://github.com/kiarashkaz/anomaly-detection-in-cMARL |
| Open Datasets | Yes | We use three test environments for evaluating the proposed detector: StarCraft II Multi-Agent Challenge (SMAC) [Samvelyan et al., 2019], Multi Particle Environment (MPE) [Mordatch and Abbeel, 2017], and Level-Based Foraging (LBF) [Papoudakis et al., 2021]. Using the implementation provided by the PyMARL [Samvelyan et al., 2019] and EPyMARL [Papoudakis et al., 2021] Python frameworks, we trained the agents. |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly describe validation splits or cross-validation for its own experiments. It refers to standard benchmarks like SMAC, MPE, and LBF, which typically have predefined splits, but the paper itself does not specify them. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions software frameworks like PyMARL and EPyMARL, and algorithms like QMIX, MAA2C, and DRQN, but it does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | The hidden state dimension of the GRU layer was 64 for SMAC-2s3z and 128 for the other scenarios. We trained the predictors for 20,000 episodes during which agents applied the policies learned in Step 1. For each attack we used a single-agent DRQN algorithm [Hausknecht and Stone, 2015] for 20,000 episodes to obtain the adversary. |
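
The values quoted in the Experiment Setup row can be gathered into a small configuration sketch. This is illustrative only and is not taken from the authors' repository: the names `PredictorConfig`, `predictor_config`, and the two episode constants are hypothetical, and only the numbers (64, 128, 20,000) come from the row above.

```python
from dataclasses import dataclass

# Hypothetical constant names; the values are those quoted in the setup row.
PREDICTOR_TRAINING_EPISODES = 20_000
ADVERSARY_TRAINING_EPISODES = 20_000  # single-agent DRQN adversary, per attack


@dataclass(frozen=True)
class PredictorConfig:
    """Assumed container for the reported predictor hyperparameters."""
    scenario: str
    gru_hidden_dim: int
    training_episodes: int = PREDICTOR_TRAINING_EPISODES


def predictor_config(scenario: str) -> PredictorConfig:
    """Return the reported GRU size: 64 for SMAC-2s3z, 128 for any other scenario."""
    hidden_dim = 64 if scenario == "SMAC-2s3z" else 128
    return PredictorConfig(scenario=scenario, gru_hidden_dim=hidden_dim)
```

Hyperparameters the row does not mention (learning rate, batch size, and so on) are left out on purpose, since the summary gives no values for them.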