Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Deep Reinforcement Learning Policies Learn Shared Adversarial Features across MDPs
Authors: Ezgi Korkmaz
AAAI 2022, pp. 7229–7238 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments in various games from Arcade Learning Environment, and discover that high sensitivity directions for neural policies are correlated across MDPs. ... Our main contributions are as follows: ... Via experiments in the Arcade Learning Environment we rigorously show that the high-sensitivity directions computed in our framework correlate strongly across states and in several cases across MDPs. |
| Researcher Affiliation | Academia | The paper lists the author as Ezgi Korkmaz. No institutional affiliation or email domain is provided within the paper's text to determine the author's affiliation type. |
| Pseudocode | Yes | Algorithm 1: High-sensitivity directions with the A_random algorithm |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the release of source code for the methodology described. |
| Open Datasets | Yes | The Arcade Learning Environment (ALE) is used as a standard baseline... In our experiments agents are trained with Double Deep Q-Network (DDQN) proposed by Wang et al. (2016) with prioritized experience replay Schaul et al. (2016) in the ALE introduced by Bellemare et al. (2013) with the Open AI baselines version Brockman et al. (2016). |
| Dataset Splits | No | The paper does not explicitly provide information on training/validation/test dataset splits, which is common for reinforcement learning environments where data is generated through interaction rather than being a static dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components like 'Double Deep Q-Network (DDQN)', 'prioritized experience replay', and 'Open AI baselines version', but it does not provide specific version numbers for these components. |
| Experiment Setup | No | The paper mentions the algorithms used for training agents (DDQN, SA-DDQN) and sets an ℓ2-norm bound κ for their framework, but it does not specify detailed experimental setup parameters such as learning rates, batch sizes, optimizers, or other hyperparameters for agent training. |
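To make the ℓ2-norm bound κ mentioned in the rows above concrete, the following is a minimal NumPy sketch of computing a gradient-based high-sensitivity direction for a toy policy and rescaling it to the ℓ2 ball of radius κ. This is an illustrative stand-in, not the paper's Algorithm 1: the linear Q-function `W`, the function names, and the finite-difference gradient are all hypothetical assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, N_ACTIONS = 16, 4
# Hypothetical stand-in for a trained Q-network: a linear Q-function.
W = rng.normal(size=(N_ACTIONS, STATE_DIM))

def q_values(state):
    """Q-values for all actions at a given state."""
    return W @ state

def sensitivity_direction(state, kappa=0.1, eps=1e-3):
    """Finite-difference gradient of the greedy action's Q-value
    w.r.t. the state, rescaled to the l2 ball of radius kappa."""
    a = int(np.argmax(q_values(state)))
    grad = np.zeros_like(state)
    for i in range(state.size):
        bumped = state.copy()
        bumped[i] += eps
        grad[i] = (q_values(bumped)[a] - q_values(state)[a]) / eps
    norm = np.linalg.norm(grad)
    return kappa * grad / norm if norm > 0 else grad

state = rng.normal(size=STATE_DIM)
direction = sensitivity_direction(state, kappa=0.1)
print(np.linalg.norm(direction))  # ~0.1, i.e. the l2 bound kappa
```

The rescaling step is what the κ bound constrains: whatever direction is computed, the perturbation applied to the state never exceeds ℓ2 norm κ.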