Learning Tree Interpretation from Object Representation for Deep Reinforcement Learning

Authors: Guiliang Liu, Xiangyu Sun, Oliver Schulte, Pascal Poupart

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that our mimic tree achieves strong approximation performance with significantly fewer nodes than baseline models. We demonstrate the interpretability of our mimic tree by showing latent traversals, decision rules, causal impacts, and human evaluation results. Our empirical evaluation shows that IMONet+MCRTS achieves a promising mimic performance with significantly fewer splits than other baseline models and an important reduction in the size of tree interpretation."
Researcher Affiliation | Collaboration | Guiliang Liu (1,3), Xiangyu Sun (2), Oliver Schulte (2), Pascal Poupart (1,3). (1) Cheriton School of Computer Science, University of Waterloo; (2) School of Computing Science, Simon Fraser University; (3) Vector Institute
Pseudocode | No | The paper includes "Figure 3: MCRTS structure", a diagram illustrating the components of MCRTS and its search process, but it does not present a structured block of pseudocode or an explicitly labeled algorithm.
Open Source Code | Yes | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Please check the attached code and Section A.2.2 in appendix for the data generation details."
Open Datasets | Yes | "We study the Flappy Bird, Space Invaders, and Assaults environments. Flappy Bird is a procedural game, where the game states are randomly generated at each episode. Space Invaders and Assaults are commonly studied Atari games from the Gym toolkit [39]."
Dataset Splits | Yes | "We divide the dataset (50k) into training (80%), validation (10%), and testing (10%) sets and generate 5 independent runs."
Hardware Specification | No | The Acknowledgments section mentions a "GPU donation from NVIDIA" but does not specify the model or configuration used for the experiments. Section 3(d) of the checklist refers to Appendix A.2.7 for compute details, implying that specific hardware information is not in the main text.
Software Dependencies | No | The paper does not explicitly state software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or scikit-learn versions) in the main text.
Experiment Setup | No | The paper states: "Check data generation details and model hyper-parameters in Appendix". Section 3(b) of the checklist confirms that implementation details and hyperparameters are in Appendix A.2.3, so these setup details are not provided in the main text.
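The 80%/10%/10% split over 50k samples with 5 independent runs reported under "Dataset Splits" can be sketched as follows. This is a minimal illustration, not the authors' code: the helper name `split_dataset` and the per-run seeding scheme are assumptions.

```python
import random

def split_dataset(n_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle sample indices and split them into train/validation/test sets.

    Hypothetical helper mirroring the reported 80/10/10 split of the
    50k-sample dataset; the seeding convention is an assumption.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test

# Five independent runs, each with its own shuffle of the 50k samples.
splits = [split_dataset(50_000, seed=run) for run in range(5)]
```

Under these assumptions each run yields 40k training, 5k validation, and 5k test samples, with the three sets disjoint within a run.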