Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Self-Supervised Attention-Aware Reinforcement Learning
Authors: Haiping Wu, Khimya Khetarpal, Doina Precup
AAAI 2021, pp. 10311-10319 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show that the self-supervised attention-aware deep RL methods outperform the baselines in the context of both the rate of convergence and performance. Furthermore, the proposed self-supervised attention is not tied with specific policies, nor restricted to a specific scene. We posit that the proposed approach is a general self-supervised attention module for multi-task learning and transfer learning, and empirically validate the generalization ability of the proposed method. |
| Researcher Affiliation | Collaboration | Haiping Wu (1, 2), Khimya Khetarpal (1, 2), Doina Precup (1, 2, 3). 1 McGill University, 2 Mila, 3 Google DeepMind, Montreal. EMAIL, EMAIL |
| Pseudocode | Yes | The pseudocode is provided in the appendix. |
| Open Source Code | Yes | The source code is available at https://github.com/happywu/Self-Sup-Attention-RL |
| Open Datasets | Yes | We evaluate our method on Atari ALE (Bellemare et al. 2013; Brockman et al. 2016) games. |
| Dataset Splits | No | The paper mentions 'The implementation details including experiment setups, network architectures and hyperparameters are provided in the appendix.' and shows results averaged over '5 random seeds' but does not provide specific percentages or counts for training, validation, or test splits in the main text. |
| Hardware Specification | No | The paper states 'The implementation details including experiment setups, network architectures and hyperparameters are provided in the appendix.' but does not include any specific hardware details like GPU or CPU models in the main body. |
| Software Dependencies | No | While the paper provides a link to the source code, it does not explicitly list software dependencies or their version numbers in the text. |
| Experiment Setup | No | The paper states 'The implementation details including experiment setups, network architectures and hyperparameters are provided in the appendix.' but does not provide specific hyperparameters or system-level training settings in the main text. |