Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Tactics of Adversarial Attack on Deep Reinforcement Learning Agents
Authors: Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply the proposed tactics to the agents trained by the state-of-the-art deep reinforcement learning algorithm including DQN and A3C. In 5 Atari games, our strategically-timed attack reduces as much reward as the uniform attack (i.e., attacking at every time step) does by attacking the agent 4 times less often. Our enchanting attack lures the agent toward designated target states with a more than 70% success rate. |
| Researcher Affiliation | Collaboration | Yen-Chen Lin¹, Zhang-Wei Hong¹, Yuan-Hong Liao¹, Meng-Li Shih¹, Min Sun¹; ¹National Tsing Hua University, Taiwan; ²NVIDIA, Santa Clara, California, USA |
| Pseudocode | No | No explicit pseudocode or algorithm block is present, although mathematical formulations for optimization problems and functions are provided. |
| Open Source Code | No | The paper states 'Our implementation will be released.' but does not provide a concrete link or access at the time of publication. |
| Open Datasets | Yes | We evaluated our tactics of adversarial attack to deep RL agents on 5 different Atari 2600 games (i.e., Ms Pacman, Pong, Seaquest, Qbert, and Chopper Command) using OpenAI Gym [Brockman et al., 2016]. |
| Dataset Splits | No | The paper describes training and evaluation on Atari games, but does not explicitly provide numerical training/validation/test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | No specific hardware details (GPU, CPU models, memory, etc.) used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions OpenAI Gym and the A3C and DQN algorithms but does not provide version numbers for any software dependencies. |
| Experiment Setup | Yes | The input to the neural network at time t was the concatenation of the last 4 images. Each of the images was resized to 84 × 84. The pixel value was rescaled to [0, 1]... We early stopped the optimizer when D(s, s + δ) < ϵ, where ϵ is a small value set to 0.007. The value of temperature T in Equation (4) is set to 1 in the experiments. |
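
The input pipeline quoted in the Experiment Setup row (last 4 images concatenated, each resized to 84 × 84, pixel values rescaled to [0, 1]) could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `resize_nearest`, `preprocess`, and `FrameStack` are hypothetical, and the nearest-neighbour resize, the channel-average grayscale conversion, and the padding of the first observation are assumptions that the quoted text does not specify.

```python
from collections import deque

import numpy as np


def resize_nearest(frame, size=84):
    """Nearest-neighbour resize of a 2-D array to (size, size).

    Assumption: the quoted setup does not name an interpolation method.
    """
    h, w = frame.shape
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return frame[rows[:, None], cols]


def preprocess(frame):
    """Map a uint8 frame of shape (H, W) or (H, W, 3) to 84x84 float32 in [0, 1]."""
    if frame.ndim == 3:
        # Assumption: grayscale by averaging colour channels.
        frame = frame.mean(axis=2)
    return (resize_nearest(frame) / 255.0).astype(np.float32)


class FrameStack:
    """Keeps the last k preprocessed frames and returns them concatenated."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def push(self, frame):
        x = preprocess(frame)
        if not self.frames:
            # Assumption: repeat the first frame until k frames are available.
            self.frames.extend([x] * self.k)
        else:
            self.frames.append(x)
        return np.stack(self.frames, axis=0)  # shape (k, 84, 84)
```

Calling `FrameStack().push(frame)` once per environment step with a raw Atari frame (for example a 210 × 160 × 3 `uint8` array) would yield a `(4, 84, 84)` array in which the most recent frame is last.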