Poisoning the Well: Can We Simultaneously Attack a Group of Learning Agents?
Authors: Ridhima Bector, Hang Xu, Abhay Aradhya, Chai Quek, Zinovi Rabinovich
IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments are based on a 3D Grid World domain and show: a) feasibility, i.e., despite the uncertainty, the attack forces a population-wide adoption of target behavior; b) efficacy, i.e., the attack is size-agnostic and transferable. |
| Researcher Affiliation | Academia | Ridhima Bector, Hang Xu, Abhay Aradhya, Chai Quek and Zinovi Rabinovich, Nanyang Technological University; {ridhima001, hang017}@e.ntu.edu.sg, {abhayaradhya, ashcquek, zinovi}@ntu.edu.sg |
| Pseudocode | No | The paper describes its methods textually and with diagrams (Figure 1, Figure 2) but does not include any explicit pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code and Appendices are available at bit.ly/github-rb-cep. |
| Open Datasets | Yes | This work tests and establishes the quality of the proposed methodology by training an attacker to learn to attack a population of navigational agents in a stochastic grid environment titled 3D Grid World [Rabinovich et al., 2010]. (A minimal victim-agent sketch follows the table.) |
| Dataset Splits | Yes | In this experiment, attack strategies are trained and tested on populations of the same size. Each strategy is tested on 20 populations. In the Implicit Collective scenario with Q-learning victim agents, 10 test populations use the same seed as the one used by the victim populations during training, while each agent in each of the remaining 10 populations uses a different seed. In the Swarm and True Collective scenarios with DQN victim agents, the neural networks of 10 test populations are initialized with random numbers drawn from the same range as used during training, while the remaining 10 test populations are initialized from a different range. (A hedged seed-split sketch follows the table.) |
| Hardware Specification | No | The paper describes the algorithms used (e.g., Q-learning, DQN) and the experimental setup, but it does not specify any hardware components like CPU models, GPU models, or memory sizes used for running the experiments. |
| Software Dependencies | No | The paper mentions learning algorithms like Q-learning and DQN, and concepts like variational autoencoders (VAE) and Wasserstein distance, but it does not specify any software packages or libraries with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The attacker training episodes are 15-step sequential attacks on freshly initialized victim populations, wherein attack step 0 corresponds to the original environment with default dynamics. After each episode, the attack strategy employed in that episode is saved if it is better than or equal to the best attack strategy found so far with respect to the last-timestep, mean, or cumulative value of at least one strategy-quality criterion (a sketch of this checkpointing rule follows the table). Experiment H1 Concatenation vs Barycenter... In this experiment, attack strategies are trained and tested on populations of the same size. Each strategy is tested on 20 populations. In the Implicit Collective scenario with Q-learning victim agents, 10 test populations use the same seed as the one used by the victim populations during training, while each agent in each of the remaining 10 populations uses a different seed. In the Swarm and True Collective scenarios with DQN victim agents, the neural networks of 10 test populations are initialized with random numbers drawn from the same range as used during training, while the remaining 10 test populations are initialized from a different range. |
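
The paper attacks populations of Q-learning victims navigating the stochastic 3D Grid World of [Rabinovich et al., 2010], whose exact layout and dynamics are not specified in this summary. The following is a minimal sketch, assuming a simplified 2D stochastic grid and a tabular Q-learning victim; the class `GridWorld`, its slip probability, rewards, and all hyperparameters are illustrative assumptions, not the paper's environment.

```python
import random

class GridWorld:
    """Toy stochastic grid used only to illustrate the victim-agent interface.
    This is NOT the 3D Grid World of [Rabinovich et al., 2010]; layout,
    rewards, and slip probability are illustrative assumptions."""

    ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

    def __init__(self, size=5, slip_prob=0.1, seed=0):
        self.size, self.slip_prob = size, slip_prob
        self.goal = (size - 1, size - 1)
        self.rng = random.Random(seed)

    def reset(self):
        self.state = (0, 0)
        return self.state

    def step(self, action):
        # Stochastic transition: with probability slip_prob a random action is executed.
        if self.rng.random() < self.slip_prob:
            action = self.rng.randrange(len(self.ACTIONS))
        dr, dc = self.ACTIONS[action]
        r = min(max(self.state[0] + dr, 0), self.size - 1)
        c = min(max(self.state[1] + dc, 0), self.size - 1)
        self.state = (r, c)
        done = self.state == self.goal
        return self.state, (1.0 if done else -0.01), done


def train_q_victim(env, episodes=200, max_steps=200, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning victim; the seed makes whole populations reproducible."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated value
    n_actions = len(env.ACTIONS)
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: q.get((s, x), 0.0))
            s2, reward, done = env.step(a)
            best_next = 0.0 if done else max(q.get((s2, x), 0.0) for x in range(n_actions))
            td_target = reward + gamma * best_next
            q[(s, a)] = q.get((s, a), 0.0) + alpha * (td_target - q.get((s, a), 0.0))
            s = s2
            if done:
                break
    return q

# A "population" of victims is then simply many such agents with different seeds:
# population = [train_q_victim(GridWorld(seed=i), seed=i) for i in range(10)]
```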
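The Dataset Splits row describes a 10/10 split of the 20 test populations: one half reuses the training seed (Q-learning victims) or the training initialization range (DQN victims), the other half does not. Below is a hedged sketch of how such splits could be constructed; the function names, population sizes, seed values, weight-initialization ranges, and layer shapes are assumptions for illustration only.

```python
import random

def make_q_test_populations(train_seed, n_agents=10, n_pops=20):
    """Implicit Collective (Q-learning victims): 10 test populations reuse the
    training seed; in the remaining 10, every agent gets a fresh seed.
    Concrete seed values are placeholders, not the paper's."""
    populations = []
    for p in range(n_pops):
        if p < n_pops // 2:
            seeds = [train_seed] * n_agents                              # same seed as training
        else:
            seeds = [random.randrange(10**6) for _ in range(n_agents)]   # per-agent fresh seeds
        populations.append(seeds)
    return populations


def init_dqn_weights(rng, shape, low, high):
    """Uniformly sample one weight matrix from the given range."""
    return [[rng.uniform(low, high) for _ in range(shape[1])] for _ in range(shape[0])]


def make_dqn_test_populations(train_range=(-0.1, 0.1), alt_range=(-0.5, 0.5),
                              n_pops=20, layer_shape=(4, 8), seed=0):
    """Swarm / True Collective (DQN victims): 10 test populations reuse the
    training initialization range, the remaining 10 use a different range."""
    rng = random.Random(seed)
    pops = []
    for p in range(n_pops):
        lo, hi = train_range if p < n_pops // 2 else alt_range
        pops.append(init_dqn_weights(rng, layer_shape, lo, hi))
    return pops
```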
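The Experiment Setup row states that an episode's attack strategy is kept if it matches or beats the best strategy so far on the last-timestep, mean, or cumulative value of at least one quality criterion. The sketch below implements that checkpointing rule under stated assumptions: the attacker/victim interaction is abstracted into a user-supplied `run_episode` callable, and the criterion name `adoption_rate` is a hypothetical placeholder, not a metric named by the paper.

```python
from statistics import mean

def summarize(history):
    """Reduce one criterion's per-attack-step values (step 0 is the unattacked
    environment) to the three summaries used for checkpointing."""
    return {"last": history[-1], "mean": mean(history), "cumulative": sum(history)}

def is_better_or_equal(candidate, best):
    """Keep the candidate if it is >= the best so far on at least one summary
    of at least one criterion."""
    if best is None:
        return True
    return any(candidate[c][s] >= best[c][s]
               for c in candidate for s in ("last", "mean", "cumulative"))

def train_attacker(run_episode, n_episodes, criteria=("adoption_rate",)):
    """`run_episode()` is a placeholder: it should run one 15-step sequential
    attack on a freshly initialized victim population and return
    (strategy, {criterion: [value per attack step]})."""
    best_summary, best_strategy = None, None
    for _ in range(n_episodes):
        strategy, per_step = run_episode()
        summary = {c: summarize(per_step[c]) for c in criteria}
        if is_better_or_equal(summary, best_summary):
            best_summary, best_strategy = summary, strategy
    return best_strategy

# Example with a dummy stand-in for the real attacker/victim loop:
# best = train_attacker(lambda: ("strategy-0", {"adoption_rate": [0.1, 0.2, 0.3]}), n_episodes=5)
```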