EDGE: Explaining Deep Reinforcement Learning Policies
Authors: Wenbo Guo, Xian Wu, Usmann Khan, Xinyu Xing
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments on Atari and MuJoCo games, we verify the explanation fidelity of our method and demonstrate how to employ interpretation to understand agent behavior, discover policy vulnerabilities, remediate policy errors, and even defend against adversarial attacks. |
| Researcher Affiliation | Academia | Wenbo Guo, The Pennsylvania State University, wzg13@ist.psu.edu; Xian Wu, The Pennsylvania State University, xkw5132@psu.edu; Usmann Khan, Georgia Institute of Technology, ukhan35@gatech.edu; Xinyu Xing, Northwestern University and The Pennsylvania State University, xinyu.xing@northwestern.edu |
| Pseudocode | No | The paper describes its methodology using mathematical equations and prose but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code of EDGE can be found in https://github.com/Henrygwb/edge. |
| Open Datasets | Yes | In this section, we evaluate EDGE on three representative RL games (all with delayed rewards): Pong in Atari, You-Shall-Not-Pass in MuJoCo, and Kick-And-Defend in MuJoCo. Supplement S5 further demonstrates the effectiveness of our method on two OpenAI Gym games (both with instant rewards). |
| Dataset Splits | No | The paper states that it 'collects a set of episodes' to fit its model but does not provide specific details on how this collected data is split into training, validation, or test sets for the purpose of training and evaluating the EDGE model itself. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for running its experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, PyTorch version, specific library versions) that would be needed to replicate the experiment environment. |
| Experiment Setup | No | The paper states: 'Implementation details and hyper-parameter choices can be found in Supplement S2.' This indicates the information is not in the main text of the paper. |