Spatiotemporally Constrained Action Space Attacks on Deep Reinforcement Learning Agents
Authors: Xian Yeow Lee, Sambit Ghadai, Kai Liang Tan, Chinmay Hegde, Soumik Sarkar
Pages: 4577-4584
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results showed that using the same amount of resources, the LAS attack deteriorates the agent's performance significantly more than the MAS attack. |
| Researcher Affiliation | Academia | (1) Department of Mechanical Engineering, Iowa State University, Ames, IA 50011; (2) Tandon School of Engineering, New York University, Brooklyn, NY 11201 |
| Pseudocode | Yes | Algorithm 1: Look-ahead Action Space (LAS) Attack |
| Open Source Code | Yes | Codes and links to supplementary are available at https://github.com/xylee95/Spatiotemporal-Attack-On-Deep-RL-Agents |
| Open Datasets | Yes | OpenAI's gym (Brockman et al. 2016) |
| Dataset Splits | No | The paper describes using OpenAI Gym environments (Lunar-Lander, Bipedal Walker, MuJoCo Hopper, Half-Cheetah, Walker) and training RL agents (PPO, DDQN), but does not provide numerical or percentage splits for training, validation, or testing datasets within these environments. Reinforcement learning typically involves interactive environments rather than static dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using PPO and Double DQN algorithms within OpenAI Gym environments, but does not provide version numbers for any software dependencies, such as Python, PyTorch, TensorFlow, or Gym itself. |
| Experiment Setup | Yes | For MAS attacks, we implemented two different spatial projection schemes, ℓ1 projection based on (Condat 2016) that represents a sparser distribution and ℓ2 projection that represents a denser distribution of attacks. For LAS attacks, all combinations of spatial and temporal projection for ℓ1 and ℓ2 were implemented. ... Top three subplots show experiments with an H value of 5 time steps and B value of 3, 4, and 5 from left to right respectively. For a direct comparison, corresponding MAS budgets are taken as b = B/H. Similarly, Figs. (d-f) have the same B values but with H=10. |
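
The projection schemes and budget conversion quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the function names `project_l2`, `project_l1`, and `mas_budget` are hypothetical. The ℓ1 routine uses the common sort-based projection, not the Condat (2016) algorithm the paper cites, and it assumes a positive budget.

```python
import math


def project_l2(x, radius):
    """Project x onto the l2 ball of the given radius (rescale if outside)."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def project_l1(x, radius):
    """Euclidean projection of x onto the l1 ball of the given radius.

    Sort-based method (an assumption for illustration; the paper cites
    Condat 2016 for its l1 projection). Assumes radius > 0.
    """
    if sum(abs(v) for v in x) <= radius:
        return list(x)
    # Find the soft-threshold theta from the magnitudes sorted in
    # descending order: keep the last k for which the condition holds.
    mags = sorted((abs(v) for v in x), reverse=True)
    cumsum = 0.0
    theta = 0.0
    for k, m in enumerate(mags, start=1):
        cumsum += m
        t = (cumsum - radius) / k
        if m - t > 0:
            theta = t
    # Shrink every magnitude by theta and clip at zero, keeping the sign.
    return [math.copysign(max(abs(v) - theta, 0.0), v) for v in x]


def mas_budget(total_budget, horizon):
    """Per-step MAS budget b = B / H, matching an LAS budget B over H steps."""
    return total_budget / horizon


if __name__ == "__main__":
    perturbation = [3.0, 1.0]
    print(project_l2(perturbation, 2.0))  # both components stay non-zero
    print(project_l1(perturbation, 2.0))  # smaller component is zeroed out
    print(mas_budget(3, 5))               # per-step budget for B=3, H=5
```

On the two-dimensional example, the ℓ1 projection zeroes the smaller component while the ℓ2 projection only rescales both. This is consistent with the row's description of ℓ1 as the sparser and ℓ2 as the denser distribution of attacks.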