Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning

Authors: Guanlin Liu, Lifeng Lai

NeurIPS 2023

Reproducibility Variable | Result | LLM Response

Research Type: Experimental
LLM Response: "In this section, we empirically compare the performance of the action poisoning only attack strategy (d-portion attack), the reward poisoning only attack strategy (η-gap attack) and the mixed attack strategy. We consider a simple case of Markov game where m = 2, H = 2 and |S| = 3. This Markov game is the example in Appendix F.2."

Researcher Affiliation: Academia
LLM Response: "Guanlin Liu and Lifeng Lai, Department of Electrical and Computer Engineering, University of California, Davis, One Shields Avenue, Davis, CA 95616; {glnliu, lflai}@ucdavis.edu"

Pseudocode: Yes
LLM Response: "Algorithm 1: Exploration phase for Markov games"

Open Source Code: No
LLM Response: The paper does not explicitly state that source code for the described methodology is provided or publicly available.

Open Datasets: No
LLM Response: "We consider a simple case of Markov game where m = 2, H = 2 and |S| = 3. This Markov game is the example in Appendix F.2."

Dataset Splits: No
LLM Response: The paper does not provide specific dataset split information (percentages, sample counts, citations, or detailed methodology) for training, validation, and testing.

Hardware Specification: No
LLM Response: The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or detailed machine specifications) used for running its experiments.

Software Dependencies: No
LLM Response: The paper mentions V-learning as an algorithm used but does not provide specific version numbers for any ancillary software components such as programming languages, libraries, or solvers.
Experiment Setup: Yes
LLM Response: "We set the total number of episodes K = 10^7. We set the total number of steps H = 6. [...] Suppose ADV_BANDIT_UPDATE of V-learning follows Algorithm 3 in Appendix J.2 and it chooses hyper-parameters w_t = α_t (∏_{i=2}^{t} (1 − α_i))^{−1}, γ_t = √(H/(Bt)) and α_t = (H + 1)/(H + t)."
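The quoted hyper-parameter schedules can be sketched in a few lines. Below is a minimal, hedged Python sketch (not the authors' code); it assumes B denotes the number of actions available to the adversarial-bandit subroutine, and the function names are our own illustrative choices:

```python
import math

def alpha(t: int, H: int) -> float:
    # Step-size schedule alpha_t = (H + 1) / (H + t)
    return (H + 1) / (H + t)

def weight(t: int, H: int) -> float:
    # Weight w_t = alpha_t * (prod_{i=2}^{t} (1 - alpha_i))^{-1};
    # the empty product for t = 1 gives w_1 = alpha_1 = 1.
    prod = 1.0
    for i in range(2, t + 1):
        prod *= 1.0 - alpha(i, H)
    return alpha(t, H) / prod

def gamma(t: int, H: int, B: int) -> float:
    # Exploration parameter gamma_t = sqrt(H / (B * t)),
    # decaying as more rounds t are played.
    return math.sqrt(H / (B * t))
```

For example, with H = 6 the schedule gives w_1 = 1 and w_2 = α_2/(1 − α_2) = (H + 1) = 7, so later rounds are weighted more heavily, which matches the role these weights play in V-learning-style updates.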