Towards Better Interpretability in Deep Q-Networks
Authors: Raghuram Mandyam Annasamy, Katia Sycara (pp. 4561-4569)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report the performance of our model on eight Atari environments (Brockman et al. 2016): Alien, Freeway, Frostbite, Gravitar, Ms Pacman, Qbert, Space Invaders, and Venture, in Table 1. |
| Researcher Affiliation | Academia | Raghuram Mandyam Annasamy Carnegie Mellon University rannasam@cs.cmu.edu Katia Sycara Carnegie Mellon University katia@cs.cmu.edu |
| Pseudocode | No | The paper describes its proposed method using equations and architecture diagrams, but it does not include a distinct pseudocode or algorithm block. |
| Open Source Code | Yes | Code available at https://github.com/maraghuram/I-DQN |
| Open Datasets | Yes | We report the performance of our model on eight Atari environments (Brockman et al. 2016): Alien, Freeway, Frostbite, Gravitar, Ms Pacman, Qbert, Space Invaders, and Venture, in Table 1. |
| Dataset Splits | No | The paper describes training for a specific number of frames and evaluating performance, but, as is typical for reinforcement learning environments, it does not specify dataset splits (e.g., percentages or counts for training, validation, and test sets) in the manner common in supervised learning. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In general, we found the values λ1 = 1.0, λ2 = 1.0, λ3 = 0.05, λ4 = 0.01 to work well across games (detailed list of hyperparameters and their values is reported in the supplementary material). |
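The reported hyperparameters λ1 = 1.0, λ2 = 1.0, λ3 = 0.05, λ4 = 0.01 weight the components of the training objective. A minimal sketch of such a weighted multi-term loss is below; the individual loss terms here are hypothetical placeholders, not the paper's actual definitions:

```python
# Hedged sketch: combining four training loss terms with fixed weights,
# matching the reported values lambda1=1.0, lambda2=1.0, lambda3=0.05, lambda4=0.01.
# The loss terms l1..l4 are placeholders; the paper defines its own terms.

def total_loss(l1, l2, l3, l4,
               lam1=1.0, lam2=1.0, lam3=0.05, lam4=0.01):
    """Weighted sum of four scalar loss terms."""
    return lam1 * l1 + lam2 * l2 + lam3 * l3 + lam4 * l4

# Example: four unit losses weighted by the reported coefficients.
print(total_loss(1.0, 1.0, 1.0, 1.0))  # 1.0 + 1.0 + 0.05 + 0.01 = 2.06
```

With these defaults, the first two terms dominate the objective while the last two act as lightly weighted regularizers.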