BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning

Authors: Lun Wang, Zaynah Javed, Xian Wu, Wenbo Guo, Xinyu Xing, Dawn Song

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We prototype and evaluate BACKDOORL in four competitive environments. The results show that when the backdoor is activated, the winning rate of the victim drops by 17% to 37% compared to when not activated.
Researcher Affiliation | Academia | Lun Wang (1), Zaynah Javed (1), Xian Wu (2), Wenbo Guo (2), Xinyu Xing (2) and Dawn Song (1); (1) University of California, Berkeley; (2) Pennsylvania State University
Pseudocode | No | The paper describes the methodology in prose and with diagrams (Figures 1 and 2) but does not include structured pseudocode or algorithm blocks. (An illustrative sketch is given after this table.)
Open Source Code | No | No implementation code is released; only demonstration videos are hosted at https://github.com/wanglun1996/multi_agent_rl_backdoor_videos.
Open Datasets | Yes | We evaluate BACKDOORL in four different environments [Bansal et al., 2017]. (See the environment-loading sketch after this table.)
Dataset Splits | No | The paper mentions training and simulation, but as a reinforcement-learning work it does not use static datasets and does not specify explicit splits (e.g., percentages or sample counts) for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (such as CPU/GPU models, processor types, or memory amounts) used for running the experiments.
Software Dependencies | No | We implement BACKDOORL in about 1700 lines of Python code. For adversarial training, we leverage an implementation of Proximal Policy Optimization (PPO) from Stable Baselines [Hill et al., 2018]. The dependencies are named but no version numbers are given. (See the PPO training sketch after this table.)
Experiment Setup | Yes | To accelerate the failure, we introduce a constant penalty reward c (c < 0) for each time-step. The adversarial training typically needs 40 to 60 epochs to converge. The agent stably learns the backdoor functionality after around 150 epochs. (See the reward-penalty sketch after this table.)
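
Pseudocode sketch. The paper gives no algorithm block; the following is a minimal, hedged sketch of the behaviour the assessment refers to: the victim acts normally until a trigger is detected in its observation of the opponent, then switches to a fast-failing policy. Every name here (backdoored_policy, trigger_detected, normal_policy, fail_fast_policy) is a hypothetical placeholder, not an identifier from the paper.

```python
# Hypothetical sketch only; the paper contains no pseudocode, and these names
# are illustrative placeholders rather than the authors' method.
def backdoored_policy(observation, trigger_detected, normal_policy, fail_fast_policy):
    """Return the victim's action for the current observation."""
    if trigger_detected(observation):
        # Backdoor activated: use the policy trained (with a constant negative
        # per-step reward) to lose as quickly as possible.
        return fail_fast_policy(observation)
    # Otherwise behave like a normally trained agent.
    return normal_policy(observation)
```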
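
Environment loading. The four evaluation environments come from the multi-agent competition suite of Bansal et al. [2017]. A minimal sketch of instantiating one of them is shown below; the package name gym_compete and the environment ID are assumptions about that suite, not details stated in this paper.

```python
# Assumes the multi-agent competition suite of Bansal et al. [2017]
# (openai/multiagent-competition); package name and environment ID are assumptions.
import gym
import gym_compete  # importing registers the competitive environments with Gym

env = gym.make("run-to-goal-humans-v0")  # assumed ID of one two-player environment
observations = env.reset()               # one observation per agent
```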
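
PPO training. The paper uses a PPO implementation from Stable Baselines [Hill et al., 2018]. A minimal, self-contained sketch with that library is shown below; the environment, timestep budget, and hyperparameters are placeholders rather than values from the paper.

```python
# Minimal Stable Baselines (TensorFlow-based, Hill et al. 2018) PPO example.
# The environment and timestep budget are placeholders, not the paper's settings.
import gym
from stable_baselines import PPO2

env = gym.make("CartPole-v1")              # placeholder single-agent environment
model = PPO2("MlpPolicy", env, verbose=1)  # PPO with a default MLP policy
model.learn(total_timesteps=100_000)       # the paper reports 40-60 epochs to converge
model.save("ppo_agent")
```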
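
Constant per-step penalty. The setup introduces a constant penalty reward c (c < 0) at every time-step to accelerate failure. A hedged Gym-style reward wrapper implementing that idea might look as follows; the class name and the default value of c are illustrative, not the authors' code.

```python
# Illustrative Gym reward wrapper adding a constant negative per-step penalty,
# as described in the experiment setup; not the authors' implementation.
import gym


class ConstantPenaltyWrapper(gym.RewardWrapper):
    def __init__(self, env, c=-1.0):
        super().__init__(env)
        assert c < 0, "the penalty c must be negative"
        self.c = c

    def reward(self, reward):
        # Add the constant penalty c to the reward at every time-step.
        return reward + self.c


# Usage: env = ConstantPenaltyWrapper(gym.make("CartPole-v1"), c=-0.5)
```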