reproducibilityindex.ai

Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning

Authors: Guanlin Liu, Lifeng Lai

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Section 6, Numerical Results: In this section, we empirically compare the performance of the action poisoning only attack strategy (d-portion attack), the reward poisoning only attack strategy (η-gap attack) and the mixed attack strategy. We consider a simple case of Markov game where m = 2, H = 2 and |S| = 3. This Markov game is the example in Appendix F.2.
Researcher Affiliation	Academia	Guanlin Liu, Lifeng Lai; Department of Electrical and Computer Engineering, University of California, Davis; One Shields Avenue, Davis, CA 95616; {glnliu, lflai}@ucdavis.edu
Pseudocode	Yes	Algorithm 1: Exploration phase for Markov games
Open Source Code	No	The paper does not explicitly state that source code for the described methodology is provided or publicly available.
Open Datasets	No	We consider a simple case of Markov game where m = 2, H = 2 and |S| = 3. This Markov game is the example in Appendix F.2.
Dataset Splits	No	The paper does not provide specific dataset split information (percentages, sample counts, citations, or detailed methodology) for training, validation, and testing.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or detailed computer specifications) used for running its experiments.
Software Dependencies	No	The paper mentions V-learning as an algorithm used but does not provide specific version numbers for any ancillary software components like programming languages, libraries, or solvers.
Experiment Setup	Yes	We set the total number of episodes $K = 10^7$. We set the total number of steps $H = 6$. [...] Suppose ADV_BANDIT_UPDATE of V-learning follows Algorithm 3 in Appendix J.2 and it chooses hyper-parameters $w_t = \alpha_t \left(\prod_{i=2}^{t} (1 - \alpha_i)\right)^{-1}$, $\gamma_t = \sqrt{\frac{[\ldots]}{Bt}}$, and $\alpha_t = \frac{H+1}{H+t}$.
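
The step-size schedule quoted in the Experiment Setup row can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the quoted hyper-parameters read alpha_t = (H + 1) / (H + t) and w_t = alpha_t * (prod_{i=2}^{t} (1 - alpha_i))^(-1), and the function names `alpha` and `weight` are hypothetical. The gamma_t parameter is left out because its full expression is not recoverable from the quoted text.

```python
from fractions import Fraction


def alpha(t: int, H: int) -> Fraction:
    """Assumed learning rate alpha_t = (H + 1) / (H + t), for t >= 1."""
    return Fraction(H + 1, H + t)


def weight(t: int, H: int) -> Fraction:
    """Assumed weight w_t = alpha_t / prod_{i=2}^{t} (1 - alpha_i)."""
    prod = Fraction(1)  # empty product when t = 1
    for i in range(2, t + 1):
        # 1 - alpha_i = (i - 1) / (H + i), which is positive for i >= 2
        prod *= 1 - alpha(i, H)
    return alpha(t, H) / prod


if __name__ == "__main__":
    H = 6  # number of steps quoted in the Experiment Setup row
    for t in range(1, 6):
        print(t, alpha(t, H), weight(t, H))
```

Exact rational arithmetic via `fractions.Fraction` avoids rounding in the product, which shrinks quickly as t grows; for long horizons a floating-point or log-space version would be the more practical choice.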