Detecting Adversarial Directions in Deep Reinforcement Learning to Make Robust Decisions
Authors: Ezgi Korkmaz, Jonah Brown-Cohen
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments in the Arcade Learning Environment with several different adversarial attack techniques. Most significantly, we demonstrate the effectiveness of our approach even in the setting where non-robust directions are explicitly optimized to circumvent our proposed method. |
| Researcher Affiliation | Academia | Correspondence to: Ezgi Korkmaz <ezgikorkmazmail@gmail.com>. Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s). |
| Pseudocode | Yes | Algorithm 1: Second Order Identification of Non-Robust Directions (SO-INRD). A hedged sketch of this detection procedure appears after the table. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the methodology or a link to a code repository. |
| Open Datasets | Yes | In our experiments agents are trained with DDQN (Wang et al., 2016) in the Arcade Learning Environment (ALE) (Bellemare et al., 2013) from OpenAI (Brockman et al., 2016). |
| Dataset Splits | No | The paper describes calibrating the detection method by recording mean and variance from a 'base run' but does not specify a training/validation/test split for the detection method's dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using DDQN and the Arcade Learning Environment but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We conducted exhaustive grid search over all the parameters in this optimization method: learning rate, iteration number, confidence parameter κ, and objective function parameter λ. In C&W we used up to 30000 iterations to find adversarial examples to bypass detection methods. We searched the confidence parameter from 0 to 50, the learning rate from 0.001 to 0.1, and λ from 0.001 to 10. A hedged sketch of this parameter sweep appears after the table. |
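The Pseudocode and Dataset Splits rows together describe the method's core loop: compute a second-order statistic along a candidate non-robust direction, calibrate its mean and variance on a clean "base run", and flag deviations at test time. Below is a minimal Python sketch of that idea, assuming a PyTorch Q-network that maps a state tensor to a vector of action values. The finite-difference curvature estimate, the k-sigma threshold, and all function names are our illustration of the quoted description, not the authors' (unreleased) implementation of Algorithm 1.

```python
import torch

def input_gradient_direction(q_network, state):
    """FGSM-style candidate for a non-robust direction: the gradient of
    the maximal Q-value with respect to the state observation."""
    s = state.clone().detach().requires_grad_(True)
    q_network(s).max().backward()
    return s.grad.detach()

def curvature_along(q_network, state, direction, eps=1e-3):
    """Central finite-difference estimate of the second-order directional
    derivative v^T H v of max_a Q(s, a) along the unit direction v."""
    v = direction / (direction.norm() + 1e-12)
    q = lambda x: q_network(x).max().item()
    return (q(state + eps * v) - 2.0 * q(state) + q(state - eps * v)) / eps ** 2

def calibrate(q_network, base_run_states):
    """Record the mean and standard deviation of the curvature statistic
    over states from a clean 'base run', as the paper describes."""
    stats = torch.tensor([
        curvature_along(q_network, s, input_gradient_direction(q_network, s))
        for s in base_run_states
    ])
    return stats.mean().item(), stats.std().item()

def flag_adversarial(q_network, state, mean, std, k=3.0):
    """Flag a state whose curvature deviates more than k standard
    deviations from the base-run statistics (k is our assumption)."""
    c = curvature_along(q_network, state, input_gradient_direction(q_network, state))
    return abs(c - mean) > k * std
```

The gradient direction here merely stands in for whatever candidate direction is being tested; the paper's Algorithm 1 may compute its second-order statistic differently.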
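The Experiment Setup row quotes an exhaustive sweep over the C&W attack's parameters to try to bypass the detector. The sketch below shows one way such a sweep could be organised, again assuming a PyTorch Q-network. `cw_attack` is a generic targeted C&W-style objective (L2 perturbation norm plus a confidence-margin term); the grid spacing between the quoted endpoints, the `detector` interface, and the stopping rule are our assumptions.

```python
import torch
from itertools import product

def cw_attack(q_network, state, target, lr, kappa, lam, iters):
    """Minimal targeted C&W-style attack on a Q-network: minimise the L2
    norm of the perturbation plus lam times a margin term that pushes the
    argmax action toward `target` with confidence margin `kappa`."""
    delta = torch.zeros_like(state, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    for _ in range(iters):
        q = q_network(state + delta).squeeze()
        non_target = torch.cat([q[:target], q[target + 1:]])
        # C&W margin: best non-target Q-value minus the target Q-value,
        # clamped at -kappa so optimisation stops once the margin holds.
        margin = torch.clamp(non_target.max() - q[target], min=-float(kappa))
        loss = delta.pow(2).sum() + lam * margin
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return delta.detach()

def grid_search(q_network, state, target, detector):
    """Sweep the quoted ranges, returning the first perturbation that both
    changes the greedy action and evades `detector` (a callable that
    returns True when it flags a state as adversarial)."""
    grid = product(
        [0.001, 0.01, 0.1],             # learning rate: 0.001 to 0.1
        [0, 10, 20, 30, 40, 50],        # confidence kappa: 0 to 50
        [0.001, 0.01, 0.1, 1.0, 10.0],  # objective parameter lambda: 0.001 to 10
        [30000],                        # up to 30000 iterations
    )
    for lr, kappa, lam, iters in grid:
        delta = cw_attack(q_network, state, target, lr, kappa, lam, iters)
        perturbed = state + delta
        fooled = q_network(perturbed).squeeze().argmax().item() == target
        if fooled and not detector(perturbed):
            return delta, (lr, kappa, lam, iters)
    return None, None
```

Running the full grid at 30000 iterations per configuration is expensive, which is consistent with the paper's description of the search as exhaustive.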