Re-understanding Finite-State Representations of Recurrent Policy Networks

Authors: Mohamad H. Danesh, Anurag Koul, Alan Fern, Saeed Khorram

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our case studies on 7 Atari games and 3 control benchmarks demonstrate that the approach can reveal insights that have not been previously noticed."
Researcher Affiliation | Academia | "Department of EECS, Oregon State University, Corvallis, OR, USA. Correspondence to: Mohamad H. Danesh <daneshm@oregonstate.edu>."
Pseudocode | Yes | "To do this, we conduct a simple form of functional pruning (details and pseudo-code in appendix), to identify and prune unnecessary branches at decision points." (a hedged sketch of this pruning idea follows the table)
Open Source Code | Yes | "Source code is available at: github.com/modanesh/Differential_IG"
Open Datasets | Yes | "We consider 7 deterministic Atari environments: Bowling, Boxing, Breakout, Ms Pacman, Pong, Sea Quest and Space Invaders, and 3 stochastic discrete-action classic control tasks: Acrobot, Cart Pole, and Lunar Lander." (an environment-setup sketch also follows the table)
Dataset Splits | No | The paper describes the environments and reports performance, but it does not specify dataset splits (e.g., percentages or counts for training, validation, or test sets).
Hardware Specification | No | The paper does not mention the specific hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper mentions algorithms such as A3C and R2D2 and techniques such as Integrated Gradients, but it does not give version numbers for any software libraries or tools (e.g., PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | No | "Detailed information on these choices along with hyperparameter choices are in the appendix."
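The paper defers the exact functional-pruning procedure to its appendix, so the details are not reproduced here. As a rough illustration of the general idea quoted in the Pseudocode row (removing unnecessary branches at decision points of an extracted finite-state machine), here is a minimal sketch. It assumes a Moore-machine representation (state -> action label, (state, observation) -> next state); the function name prune_machine and the two specific reductions shown are assumptions for illustration, not the authors' algorithm.

```python
from collections import deque

def prune_machine(start, actions, transitions):
    """Illustrative (hypothetical) pruning of an extracted Moore machine.

    actions:     dict state -> action label (Moore output)
    transitions: dict state -> dict observation -> next state

    Two conservative reductions:
      1. collapse a decision point whose branches all agree on the
         successor state (the observation is irrelevant there), and
      2. drop states that are unreachable from the start state.
    """
    # 1. Collapse branches that all lead to the same successor.
    for s, branches in transitions.items():
        targets = set(branches.values())
        if len(targets) == 1:
            transitions[s] = {"*": targets.pop()}  # single wildcard branch

    # 2. Keep only states reachable from the start state.
    reachable, frontier = {start}, deque([start])
    while frontier:
        s = frontier.popleft()
        for nxt in transitions.get(s, {}).values():
            if nxt not in reachable:
                reachable.add(nxt)
                frontier.append(nxt)
    actions = {s: a for s, a in actions.items() if s in reachable}
    transitions = {s: b for s, b in transitions.items() if s in reachable}
    return actions, transitions

# Toy machine: state 2 is unreachable, and both reachable decision
# points ignore the observation, so their branches collapse to "*".
actions = {0: "LEFT", 1: "RIGHT", 2: "NOOP"}
transitions = {0: {"a": 1, "b": 1}, 1: {"a": 0, "b": 0}, 2: {"a": 0}}
actions, transitions = prune_machine(0, actions, transitions)
```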
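The Open Datasets row lists standard RL benchmarks rather than fixed datasets, which is why no splits are reported. Below is a minimal sketch of how the listed environments might be instantiated with OpenAI Gym. It assumes the pre-0.26 gym API (reset returns an observation, step returns a 4-tuple) and the Deterministic-v4 Atari variants; the exact environment IDs and package versions are assumptions, since the paper does not state them.

```python
import gym  # assumes `gym` < 0.26 with the atari and box2d extras installed

# Atari games named in the paper; the "*Deterministic-v4" IDs (fixed
# frameskip, no sticky actions) are an assumed match for the paper's
# "deterministic Atari environments".
ATARI = ["Bowling", "Boxing", "Breakout", "MsPacman",
         "Pong", "Seaquest", "SpaceInvaders"]
CONTROL = ["Acrobot-v1", "CartPole-v1", "LunarLander-v2"]

envs = [gym.make(f"{name}Deterministic-v4") for name in ATARI]
envs += [gym.make(name) for name in CONTROL]

# Smoke-test each environment with a single random step.
for env in envs:
    obs = env.reset()
    obs, reward, done, info = env.step(env.action_space.sample())
    env.close()
```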