Reward Poisoning Attacks on Offline Multi-Agent Reinforcement Learning
Authors: Young Wu, Jeremy McMahan, Xiaojin Zhu, Qiaomin Xie
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We show that the attacker can install the target policy as a Markov Perfect Dominant Strategy Equilibrium (MPDSE), which rational agents are guaranteed to follow, and that the attack works against various MARL agents. We exhibit linear programs that efficiently solve the attack problem and study the relationship between the structure of the dataset and the minimal attack cost. Our work paves the way for studying defense in offline MARL. Contributions: we introduce reward-poisoning attacks in offline MARL; we show that any attack that reduces to attacking each agent's single-agent RL problem separately must be suboptimal; we present a reward-poisoning framework guaranteeing that the target policy π becomes an MPDSE of the underlying Markov Game; and we show the attack can be constructed efficiently via a linear program. Proofs of these results are given in the appendix. |
| Researcher Affiliation | Academia | University of Wisconsin-Madison yw@cs.wisc.edu, jmcmahan@wisc.edu, jerryzhu@cs.wisc.edu, qiaomin.xie@wisc.edu |
| Pseudocode | No | The paper presents mathematical formulations for optimization problems (e.g., (1), (2), (3)-(7)) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | No | The paper refers to a 'fixed batch dataset D' but does not name a specific public dataset or provide any access details (URL, DOI, citation) for it. |
| Dataset Splits | No | The paper does not specify training, validation, or test dataset splits. It discusses theoretical formulations and properties rather than empirical evaluations with data partitions. |
| Hardware Specification | No | The paper does not specify any hardware used for computations or experiments (e.g., GPU/CPU models, memory, cloud resources). |
| Software Dependencies | No | The paper mentions 'standard optimization solvers' but does not specify any particular software, libraries, or their version numbers. |
| Experiment Setup | No | The paper focuses on theoretical formulations and analysis of reward-poisoning attacks. It does not provide specific experimental setup details such as hyperparameter values, training configurations, or system-level settings typically found in empirical studies. |
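Although the paper provides no code, the flavor of its linear-programming attack construction can be illustrated with a toy sketch. The snippet below is NOT the paper's formulation: it handles only a single-state, single-agent bandit, where minimizing the total L1 reward perturbation that makes a target action dominate every other action by a margin reduces to a one-dimensional piecewise-linear minimization (a special case solvable without a general LP solver). The function name `make_dominant` and the margin parameter `iota` are illustrative assumptions; the paper's attack instead solves a linear program over an entire Markov Game and all agents.

```python
# Toy sketch (not the paper's method): minimal L1 reward perturbation
# installing `target` as the dominant action in a one-state bandit.
# Constraint for each action a != target:
#     r[target] + d[target] >= r[a] + d[a] + iota.
def make_dominant(rewards, target, iota):
    """Return the minimal total |delta| over all actions such that
    `target` beats every other action by at least `iota`."""
    # gap_a: how much the combined change (raising the target reward
    # and/or lowering action a's reward) must cover for action a.
    gaps = [rewards[a] + iota - rewards[target]
            for a in range(len(rewards)) if a != target]
    deficits = [g for g in gaps if g > 0]  # only violated constraints cost anything
    # If we raise the target by x, each violating action must be
    # lowered by max(0, g - x), so cost(x) = x + sum(max(0, g - x)).
    # This piecewise-linear function attains its minimum at x = 0 or
    # at one of the breakpoints x = g.
    candidates = [0.0] + deficits
    return min(x + sum(max(0.0, g - x) for g in deficits)
               for x in candidates)
```

For example, with rewards `[1.0, 2.0, 0.5]`, target action `0`, and margin `0.1`, the minimal attack cost is `1.1` (e.g., raise action 0's reward by 1.1). The multi-agent, multi-state version the paper studies cannot be decomposed this way per agent, which is exactly why naive per-agent attacks are shown to be suboptimal.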