Smoothing Advantage Learning

Authors: Yaozhong Gan, Zhe Zhang, Xiaoyang Tan (pp. 6657-6664)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we present our experimental results conducted over six games (LunarLander, Asterix, Breakout, Space Invaders, Seaquest, Freeway) from Gym (Brockman et al. 2016) and MinAtar (Young and Tian 2019). In addition, we also run some experiments on Atari games in the Appendix."
Researcher Affiliation | Academia | College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics
Pseudocode | Yes | "Algorithm 1 gives the detailed implementation pipeline in the Appendix."
Open Source Code | No | The paper does not provide a link to open-source code for the methodology described, nor does it explicitly state that the code is made publicly available.
Open Datasets | Yes | "In this section, we present our experimental results conducted over six games (LunarLander, Asterix, Breakout, Space Invaders, Seaquest, Freeway) from Gym (Brockman et al. 2016) and MinAtar (Young and Tian 2019)." (See the environment sketch below the table.)
Dataset Splits | No | The paper mentions test procedures but does not specify training, validation, or test dataset splits (e.g., percentages or sample counts) for the environments used.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | "Particularly, we choose α from the set of {0.2, 0.3, 0.5, 0.9} for AL (Bellemare et al. 2016). For SAL, we choose ω and α among {0.2, 0.3, 0.5, 0.9}, but the hyperparameters satisfy α < ω. For Munchausen-DQN (M-DQN) (Vieillard, Pietquin, and Geist 2020), we fix τ = 0.03 and choose α from the set of {0.2, 0.3, 0.5, 0.9}." (See the hyperparameter sketch below the table.)
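
As a reference for the Open Datasets row, below is a minimal sketch of how the six listed environments could be instantiated, assuming the standard Gym and MinAtar Python APIs. The exact environment ids ("LunarLander-v2" and the lowercase MinAtar game names) are assumptions following each library's naming conventions, not ids quoted from the paper.

```python
import gym                       # OpenAI Gym (Brockman et al. 2016)
from minatar import Environment  # MinAtar (Young and Tian 2019)

# Gym control task named in the paper; the "-v2" suffix is an assumed version id.
lunar = gym.make("LunarLander-v2")
lunar.reset()

# MinAtar games named in the paper; the lowercase ids below are assumptions
# based on MinAtar's naming convention.
minatar_games = ["asterix", "breakout", "space_invaders", "seaquest", "freeway"]
minatar_envs = {name: Environment(name) for name in minatar_games}
for env in minatar_envs.values():
    env.reset()
```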
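
For the Experiment Setup row, the sketch below simply enumerates the hyperparameter sweep described in the quoted passage, assuming a plain dict-per-configuration representation; the keys (`algo`, `alpha`, `omega`, `tau`) are illustrative names, not identifiers taken from the paper.

```python
from itertools import product

# Candidate values reported in the paper's experiment setup.
ALPHA_SET = (0.2, 0.3, 0.5, 0.9)

# AL (Bellemare et al. 2016): sweep alpha over the candidate set.
al_configs = [{"algo": "AL", "alpha": a} for a in ALPHA_SET]

# SAL: sweep (omega, alpha) pairs, keeping only those with alpha < omega.
sal_configs = [
    {"algo": "SAL", "omega": w, "alpha": a}
    for w, a in product(ALPHA_SET, ALPHA_SET)
    if a < w
]

# M-DQN (Vieillard, Pietquin, and Geist 2020): tau fixed at 0.03, sweep alpha.
mdqn_configs = [{"algo": "M-DQN", "tau": 0.03, "alpha": a} for a in ALPHA_SET]

if __name__ == "__main__":
    for cfg in al_configs + sal_configs + mdqn_configs:
        print(cfg)
```

With the values quoted above, this yields 4 AL configurations, 6 valid (ω, α) pairs for SAL, and 4 M-DQN configurations.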