Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions

Authors: Ezgi Korkmaz, Jonah Brown-Cohen

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments in the Arcade Learning Environment with several different adversarial attack techniques. Most significantly, we demonstrate the effectiveness of our approach even in the setting where non-robust directions are explicitly optimized to circumvent our proposed method.
Researcher Affiliation | Academia | Correspondence to: Ezgi Korkmaz <ezgikorkmazmail@gmail.com>. Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).
Pseudocode | Yes | Algorithm 1: Second Order Identification of Non-Robust Directions (SO-INRD). (A hedged sketch of this kind of procedure appears after this table.)
Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the methodology or a link to a code repository.
Open Datasets | Yes | In our experiments agents are trained with DDQN (Wang et al., 2016) in the Arcade Learning Environment (ALE) (Bellemare et al., 2013) from OpenAI (Brockman et al., 2016). (A minimal environment-loop sketch appears after this table.)
Dataset Splits | No | The paper describes calibrating the detection method by recording the mean and variance from a 'base run', but does not specify a training/validation/test split for the detection method's dataset. (A calibration sketch appears after this table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using DDQN and the Arcade Learning Environment but does not provide version numbers for any software dependencies or libraries.
Experiment Setup | Yes | We conducted exhaustive grid search over all the parameters in this optimization method: learning rate, iteration number, confidence parameter κ, and objective function parameter λ. In C&W we used up to 30000 iterations to find adversarial examples to bypass detection methods. We searched the confidence parameter from 0 to 50, the learning rate from 0.001 to 0.1, and λ from 0.001 to 10. (A grid-search sketch appears after this table.)
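
SO-INRD sketch (referenced from the Pseudocode row). The paper's Algorithm 1 identifies non-robust directions from second-order information. The code below is only a minimal illustration of one such procedure: it approximates the dominant eigenvector and eigenvalue of the Hessian of a scalar objective with respect to the input state by power iteration on Hessian-vector products. The choice of objective (the maximum Q-value), the iteration count, and the function names are assumptions, not the paper's exact Algorithm 1.

```python
# Hedged sketch: dominant Hessian eigendirection of a scalar objective with
# respect to the input state, via power iteration on Hessian-vector products.
import torch

def dominant_hessian_direction(q_network, state, n_iters=20):
    """Approximate the top eigenvalue/eigenvector of the input-space Hessian
    of an assumed scalar objective (here: the max Q-value)."""
    state = state.detach().requires_grad_(True)
    objective = q_network(state).max()  # assumed objective; not from the paper
    grad = torch.autograd.grad(objective, state, create_graph=True)[0]

    v = torch.randn_like(state)
    v = v / v.norm()
    eigval = torch.tensor(0.0)
    for _ in range(n_iters):
        # Hessian-vector product by double backprop: H v = d(grad . v)/d(state)
        hv = torch.autograd.grad((grad * v).sum(), state, retain_graph=True)[0]
        eigval = (v * hv).sum()           # Rayleigh quotient with unit-norm v
        v = hv / (hv.norm() + 1e-12)      # renormalize for the next iteration
    return v.detach(), float(eigval)
```

In the paper's setting, unusually large curvature along such a direction is the signal of a non-robust direction, and the resulting statistics are compared against those recorded on a clean base run (see the calibration sketch below).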
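Environment sketch (referenced from the Open Datasets row). A minimal interaction loop with the Arcade Learning Environment through the classic gym Atari interface might look as follows. The game choice and the random policy are placeholders (the paper trains DDQN agents), and the 4-tuple step return assumes the pre-0.26 gym API with Atari ROMs installed.

```python
# Minimal ALE interaction loop (assumes gym[atari] with ROMs installed and the
# pre-0.26 gym step API; the game and the random policy are placeholders).
import gym

env = gym.make("PongNoFrameskip-v4")  # hypothetical game choice
state = env.reset()
done, episode_return = False, 0.0
while not done:
    action = env.action_space.sample()  # stand-in for a trained DDQN policy
    state, reward, done, info = env.step(action)
    episode_return += reward
env.close()
print(f"episode return: {episode_return}")
```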
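Calibration sketch (referenced from the Dataset Splits row). The paper calibrates the detector by recording the mean and variance of its statistic over a clean base run. A minimal version of that calibration plus a thresholded decision rule is below; the k-sigma rule and the sensitivity parameter k are assumptions, since this report only records that mean and variance are collected.

```python
# Hedged sketch of 'base run' calibration and detection.
import numpy as np

def calibrate(base_run_stats):
    """Record mean and standard deviation of the detection statistic
    (e.g., the curvature estimate above) over a clean base run."""
    return float(np.mean(base_run_stats)), float(np.std(base_run_stats))

def flag_state(stat, mean, std, k=3.0):
    # k is a hypothetical sensitivity parameter, not taken from the paper
    return abs(stat - mean) > k * std
```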
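Grid-search sketch (referenced from the Experiment Setup row). The quoted setup searches the C&W attack's confidence parameter κ in [0, 50], learning rate in [0.001, 0.1], and λ in [0.001, 10], with up to 30000 iterations. The sketch below enumerates a hypothetical grid inside those ranges; the paper does not list the actual grid points used.

```python
# Enumerate a hypothetical C&W hyperparameter grid within the quoted ranges;
# the exact grid points searched by the authors are not reported.
from itertools import product

kappas = [0, 10, 20, 30, 40, 50]          # confidence parameter kappa
learning_rates = [0.001, 0.01, 0.1]
lambdas = [0.001, 0.01, 0.1, 1.0, 10.0]   # objective function parameter lambda

configs = [
    {"kappa": k, "lr": lr, "lam": lam, "max_iters": 30000}
    for k, lr, lam in product(kappas, learning_rates, lambdas)
]
print(f"{len(configs)} attack configurations to evaluate")
```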