Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Towards Better Interpretability in Deep Q-Networks
Authors: Raghuram Mandyam Annasamy, Katia Sycara
AAAI 2019, pp. 4561-4569 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report the performance of our model on eight Atari environments (Brockman et al. 2016): Alien, Freeway, Frostbite, Gravitar, Ms Pacman, Qbert, Space Invaders, and Venture, in Table 1. |
| Researcher Affiliation | Academia | Raghuram Mandyam Annasamy Carnegie Mellon University EMAIL Katia Sycara Carnegie Mellon University EMAIL |
| Pseudocode | No | The paper describes its proposed method using equations and architecture diagrams, but it does not include a distinct pseudocode or algorithm block. |
| Open Source Code | Yes | Code available at https://github.com/maraghuram/I-DQN |
| Open Datasets | Yes | We report the performance of our model on eight Atari environments (Brockman et al. 2016): Alien, Freeway, Frostbite, Gravitar, Ms Pacman, Qbert, Space Invaders, and Venture, in Table 1. |
| Dataset Splits | No | The paper discusses training for a specific number of frames and evaluating performance but, as is common for reinforcement learning environments, does not specify dataset splits (e.g., percentages or counts for training, validation, and test sets) in the manner typical of supervised learning. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In general, we found the values λ1 = 1.0, λ2 = 1.0, λ3 = 0.05, λ4 = 0.01 to work well across games (detailed list of hyperparameters and their values is reported in the supplementary material). |
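The reported coefficients suggest a loss that is a weighted sum of four terms. As a minimal sketch only: the term names below are hypothetical placeholders (the paper's actual loss components are defined in its equations), and only the λ values are taken from the quote above.

```python
# Hypothetical sketch of a four-term weighted loss using the reported
# coefficients lambda1=1.0, lambda2=1.0, lambda3=0.05, lambda4=0.01.
# The term names (td_loss, aux_a, aux_b, aux_c) are placeholders and are
# NOT taken from the paper.

def combined_loss(td_loss, aux_a, aux_b, aux_c,
                  lam1=1.0, lam2=1.0, lam3=0.05, lam4=0.01):
    """Weighted sum of loss components with the reported lambda values."""
    return lam1 * td_loss + lam2 * aux_a + lam3 * aux_b + lam4 * aux_c

# With unit losses, the result shows each term's relative contribution:
# 1.0 + 1.0 + 0.05 + 0.01
print(combined_loss(1.0, 1.0, 1.0, 1.0))
```

Because λ3 and λ4 are small, the last two terms act as mild regularizers relative to the first two under this reading.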