Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning
Authors: Guanlin Liu, Lifeng Lai
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 6 (Numerical Results): In this section, we empirically compare the performance of the action poisoning only attack strategy (d-portion attack), the reward poisoning only attack strategy (η-gap attack) and the mixed attack strategy. We consider a simple case of Markov game where m = 2, H = 2 and \|S\| = 3. This Markov game is the example in Appendix F.2. |
| Researcher Affiliation | Academia | Guanlin Liu Lifeng Lai Department of Electrical and Computer Engineering University of California, Davis One Shields Avenue, Davis, CA 95616 {glnliu, lflai}@ucdavis.edu |
| Pseudocode | Yes | Algorithm 1: Exploration phase for Markov games |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is provided or publicly available. |
| Open Datasets | No | We consider a simple case of Markov game where m = 2, H = 2 and \|S\| = 3. This Markov game is the example in Appendix F.2. |
| Dataset Splits | No | The paper does not provide specific dataset split information (percentages, sample counts, citations, or detailed methodology) for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions V-learning as an algorithm used but does not provide specific version numbers for any ancillary software components like programming languages, libraries, or solvers. |
| Experiment Setup | Yes | We set the total number of episodes $K = 10^7$. We set the total number of steps $H = 6$. [...] Suppose ADV_BANDIT_UPDATE of V-learning follows Algorithm 3 in Appendix J.2 and it chooses hyper-parameter $w_t = \alpha_t \left(\prod_{i=2}^{t} (1 - \alpha_i)\right)^{-1}$, $\gamma_t = \sqrt{[\ldots]/(Bt)}$ and $\alpha_t = \frac{H+1}{H+t}$. |
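
The Experiment Setup row quotes a step-size schedule of the form α_t = (H+1)/(H+t) together with a weight w_t equal to α_t divided by the product of (1 − α_i) for i = 2, …, t. As a rough illustration of how these two quantities behave, the sketch below evaluates them with exact rational arithmetic. The function names `alpha` and `weight` are hypothetical and are not taken from the paper or any released code; the value H = 6 is the one quoted in the table.

```python
from fractions import Fraction


def alpha(t: int, H: int) -> Fraction:
    """Step size alpha_t = (H + 1) / (H + t), for t >= 1 (hypothetical helper)."""
    return Fraction(H + 1, H + t)


def weight(t: int, H: int) -> Fraction:
    """Weight w_t = alpha_t / prod_{i=2}^{t} (1 - alpha_i) (hypothetical helper).

    For t = 1 the product is empty and equals 1, so w_1 = alpha_1 = 1.
    """
    prod = Fraction(1)
    for i in range(2, t + 1):
        prod *= 1 - alpha(i, H)  # each factor equals (i - 1) / (H + i) > 0
    return alpha(t, H) / prod


if __name__ == "__main__":
    H = 6  # number of steps quoted in the Experiment Setup row
    for t in range(1, 5):
        print(t, alpha(t, H), weight(t, H))
```

Under this schedule α_t shrinks toward zero as t grows while w_t grows, so later samples receive more weight in whatever weighted average these values feed into.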