Gamma-Nets: Generalizing Value Estimation over Timescale
Authors: Craig Sherstan, Shibhansh Dohare, James MacGlashan, Johannes Günther, Patrick M. Pilarski
AAAI 2020, pp. 5717–5725
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first provide two demonstrations by 1) predicting a square wave and 2) predicting sensorimotor signals on a robot arm using a linear function approximator. Next, we empirically evaluate Γ-nets in the deep reinforcement learning setting using policy evaluation on a set of Atari video games. |
| Researcher Affiliation | Collaboration | ¹Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada; ²Cogitai, USA |
| Pseudocode | No | The paper does not include any figure, block, or section labeled 'Pseudocode' or 'Algorithm', nor are there structured steps formatted like code. |
| Open Source Code | No | The paper states 'Additional results and experimental details are available from Sherstan et al. (2019),' which cites an arXiv preprint (arXiv:1911.07794). This is not an explicit statement of code release, nor a direct link to a code repository for the methodology. |
| Open Datasets | Yes | We examined the performance of Γ-nets under policy evaluation in the Arcade Learning Environment (ALE) (Bellemare et al. 2015). |
| Dataset Splits | No | The paper describes how 'evaluation points' were created and used to compute returns, and how the models were trained, but it does not specify explicit training, validation, and test dataset splits with percentages or counts, or refer to predefined splits for reproducibility. |
| Hardware Specification | No | The paper describes the software components and training duration (e.g., 'Rainbow agent', 'trained for 25 million frames'), but it does not specify any hardware details such as GPU/CPU models or types of computational resources used for running the experiments. |
| Software Dependencies | No | The paper mentions the use of specific agents and frameworks (e.g., 'Dopamine project’s implementation of the Rainbow agent', 'DQN agent', 'prioritized replay', 'n-step returns', 'distributional representation of the value estimates'), but it does not provide specific version numbers for any key software components or libraries (e.g., Python, TensorFlow, PyTorch versions). |
| Experiment Setup | Yes | The Γ-net network consisted of five fully-connected layers of sizes [512, 256, 128, 16, 1], with all but the final layer using ReLU activation. ... Each network was trained for 20M frames... A Γt of size 8 was used, which always included lower and upper bounds of τ = [1, 100]. An additional 6 γk were drawn on each timestep. Unless otherwise stated the sampling was done by drawing 3 timescales uniformly each from the γ scale on [0, 0.99) and the τ scale on [1, 100) (for τ we drew from the integer scales). |
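
The timescale sampling in the Experiment Setup row is compact and easy to misread, so below is a minimal sketch of one plausible reading. The function names (`tau_to_gamma`, `sample_gammas`) and the NumPy implementation are illustrative assumptions, not the authors' code; only the constants (set size 8, fixed bounds τ = [1, 100], 3 draws per scale, γ ∈ [0, 0.99), integer τ ∈ [1, 100)) come from the paper's description, together with the standard relation τ = 1/(1 − γ), e.g. γ = 0.99 ↔ τ = 100.

```python
import numpy as np

# Hypothetical sketch of the per-step discount sampling described in the
# Experiment Setup row. Names are illustrative, not from the paper's code.

def tau_to_gamma(tau):
    """Convert a timescale tau to a discount via gamma = 1 - 1/tau."""
    return 1.0 - 1.0 / tau

def sample_gammas(rng):
    """Draw the set Γt of 8 discounts used on one training step:
    fixed bounds tau = [1, 100] plus 6 freshly sampled timescales."""
    # Fixed lower/upper bounds: tau=1 -> gamma=0.0, tau=100 -> gamma=0.99.
    bounds = [tau_to_gamma(1.0), tau_to_gamma(100.0)]
    # 3 discounts drawn uniformly on the gamma scale [0, 0.99).
    from_gamma = rng.uniform(0.0, 0.99, size=3)
    # 3 discounts drawn uniformly on the integer tau scale [1, 100).
    from_tau = tau_to_gamma(rng.integers(1, 100, size=3).astype(float))
    return np.concatenate([bounds, from_gamma, from_tau])

rng = np.random.default_rng(0)
print(sample_gammas(rng))  # 8 gammas, always including 0.0 and 0.99
```

The two scales are not interchangeable: drawing uniformly on γ concentrates samples at short timescales (half the γ interval maps to τ < 2), while drawing uniformly on τ covers long horizons evenly, which is presumably why the paper mixes both.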