Reward Poisoning Attacks on Offline Multi-Agent Reinforcement Learning

Authors: Young Wu, Jeremy McMahan, Xiaojin Zhu, Qiaomin Xie

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We show that the attacker can install the target policy as a Markov Perfect Dominant Strategy Equilibrium (MPDSE), which rational agents are guaranteed to follow. The attack works against various MARL agents, and we exhibit linear programs that solve the attack problem efficiently. We also study the relationship between the structure of the dataset and the minimal attack cost. Our work paves the way for studying defenses in offline MARL. Our contributions: we introduce reward-poisoning attacks in offline MARL; we show that any attack that reduces to attacking each agent's single-agent RL problem separately must be suboptimal; we present a reward-poisoning framework that guarantees the target policy π becomes an MPDSE of the underlying Markov game; and we show the attack can be constructed efficiently via a linear program. Proofs of these results appear in the appendix.
Researcher Affiliation | Academia | University of Wisconsin-Madison; yw@cs.wisc.edu, jmcmahan@wisc.edu, jerryzhu@cs.wisc.edu, qiaomin.xie@wisc.edu
Pseudocode | No | The paper presents mathematical formulations for its optimization problems (e.g., (1), (2), (3)-(7)) but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing source code for the described methodology or a link to a code repository.
Open Datasets | No | The paper refers to a 'fixed batch dataset D' but does not name a specific public dataset or provide any access details (URL, DOI, citation) for it.
Dataset Splits | No | The paper does not specify training, validation, or test dataset splits. It discusses theoretical formulations and properties rather than empirical evaluations with data partitions.
Hardware Specification | No | The paper does not specify any hardware used for computations or experiments (e.g., GPU/CPU models, memory, cloud resources).
Software Dependencies | No | The paper mentions 'standard optimization solvers' but does not specify any particular software, libraries, or their version numbers.
Experiment Setup | No | The paper focuses on theoretical formulations and analysis of reward-poisoning attacks. It does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings typically found in empirical studies.
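The linear-program formulation noted under Research Type can be illustrated with a toy sketch. This is not the paper's exact program (which operates on an offline Markov Game dataset): it is a minimal single-state, two-player analogue in which an attacker perturbs a reward matrix at minimal L1 cost so that a target joint action becomes a dominant strategy equilibrium with a chosen margin. The game matrix `R`, the `target` profile, and the margin `iota` are all made-up example values.

```python
import numpy as np
from scipy.optimize import linprog

# Toy 2x2 game: R[i, a0, a1] is agent i's reward under joint action (a0, a1).
R = np.array([[[1.0, 0.0],
               [0.0, 2.0]],   # agent 0's rewards
              [[1.0, 0.0],
               [0.0, 2.0]]])  # agent 1's rewards
target = (0, 0)   # joint action the attacker wants to install
iota = 0.5        # strict dominance margin

n = R.size  # 8 poisoned-reward variables r
# Decision vector x = [r (n entries), t (n entries)], with t >= |r - R|
# linearizing the L1 attack cost, which is what we minimize.
c = np.concatenate([np.zeros(n), np.ones(n)])

A_ub, b_ub = [], []
idx = lambda i, a0, a1: np.ravel_multi_index((i, a0, a1), R.shape)

# L1 linearization:  r - t <= R   and   -r - t <= -R.
for k in range(n):
    row = np.zeros(2 * n); row[k] = 1.0; row[n + k] = -1.0
    A_ub.append(row); b_ub.append(R.flat[k])
    row = np.zeros(2 * n); row[k] = -1.0; row[n + k] = -1.0
    A_ub.append(row); b_ub.append(-R.flat[k])

# Dominance constraints: for each agent, against every opponent action,
# the target action must beat every deviation by at least iota:
#   r(dev) - r(target) <= -iota.
for i in range(2):
    for opp in range(2):          # the other agent's action
        for dev in range(2):      # agent i's deviating action
            if dev == target[i]:
                continue
            row = np.zeros(2 * n)
            if i == 0:
                row[idx(0, dev, opp)] = 1.0
                row[idx(0, target[0], opp)] = -1.0
            else:
                row[idx(1, opp, dev)] = 1.0
                row[idx(1, opp, target[1])] = -1.0
            A_ub.append(row); b_ub.append(-iota)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n + [(0, None)] * n)
poisoned = res.x[:n].reshape(R.shape)  # the poisoned reward matrix
cost = res.fun                         # minimal L1 modification cost
```

After solving, `poisoned` satisfies the dominance margin for both agents and `cost` is the cheapest total reward modification achieving it; the paper's actual construction extends this idea to full Markov Games and dataset-dependent feasible sets.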