reproducibilityindex.ai

Two-Timescale Networks for Nonlinear Value Function Approximation

Authors: Wesley Chung, Somjit Nath, Ajin Joseph, Martha White

ICLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically demonstrate the benefits of TTNs, compared to other nonlinear value function approximation algorithms, both for policy evaluation and control.
Researcher Affiliation	Academia	Wesley Chung, Somjit Nath, Ajin George Joseph and Martha White; Department of Computing Science, University of Alberta
Pseudocode	Yes	Algorithm 1 Training of TTNs; Algorithm 2 TD(λ) algorithm; Algorithm 3 GTD2 algorithm
Open Source Code	No	The paper does not contain any statements or links indicating the release of open-source code for the described methodology.
Open Datasets	Yes	We use the Open AI gym implementation (Brockman et al., 2016); In Puck World (Tasfi, 2016)
Dataset Splits	No	The paper describes how value estimates are evaluated (using 500 states for RMSVE), but it does not specify a training/validation/test split for the overall dataset used in the experiments.
Hardware Specification	No	The paper discusses computational aspects like O(d^2) complexity but does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions "AMSGrad optimizer (Reddi et al., 2018)", "Pygame Learning Environment (Tasfi, 2016)", and "Open AI gym implementation (Brockman et al., 2016)", but it does not specify version numbers for any programming languages, libraries, or other software dependencies.
Experiment Setup	Yes	To choose hyperparameters, we first did a preliminary sweep on a broad range and then chose a smaller range where the algorithms usually made progress, summarized in Appendix D. Results are reported for hyperparameters in the refined range, chosen based on RMSVE over the latter half of a run with shaded regions corresponding to one standard error.
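
The selection criterion quoted in the Experiment Setup row (RMSVE over the latter half of a run, reported with one standard error) can be illustrated with a minimal sketch. This is not the authors' code, since the paper releases none. The function names, the array layout (runs by checkpoints), the unweighted mean over evaluation states, and the choice to average each run's latter half before computing the standard error across runs are all assumptions made for illustration.

```python
import numpy as np


def rmsve(v_hat, v_true):
    """Root-mean-squared value error over a fixed set of evaluation states.

    Assumption: every evaluation state is weighted equally.
    """
    v_hat = np.asarray(v_hat, dtype=float)
    v_true = np.asarray(v_true, dtype=float)
    return float(np.sqrt(np.mean((v_hat - v_true) ** 2)))


def latter_half_score(curves):
    """Summarize RMSVE learning curves for one hyperparameter setting.

    curves: array-like of shape (n_runs, n_checkpoints), one RMSVE curve per run.
    Returns (mean, standard error) of the per-run average over the latter half.
    """
    curves = np.asarray(curves, dtype=float)
    half = curves.shape[1] // 2
    # Average each run over its latter half of checkpoints.
    per_run = curves[:, half:].mean(axis=1)
    mean = float(per_run.mean())
    # One standard error across runs, using the sample standard deviation.
    stderr = float(per_run.std(ddof=1) / np.sqrt(len(per_run)))
    return mean, stderr


if __name__ == "__main__":
    # Hypothetical curves for two runs of one hyperparameter setting.
    example = [[4.0, 2.0, 1.0, 1.0], [4.0, 2.0, 3.0, 3.0]]
    print(latter_half_score(example))
```

Under this sketch, the setting with the lowest mean from `latter_half_score` would be the one selected. Scoring only the latter half of a run favors settings that reach a good final estimate over those that merely learn quickly at the start.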