Expected Eligibility Traces
Authors: Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa
AAAI 2021, pp. 9997–10005
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | From the "Empirical Analysis" section: "From the insights above, we expect that ET(λ) yields lower prediction errors because it has lower variance and aggregates information across episodes better. In this section we empirically investigate expected traces in several experiments." |
| Researcher Affiliation | Collaboration | Hado van Hasselt (1), Sephora Madjiheurem (2), Matteo Hessel (1), David Silver (1), André Barreto (1), Diana Borsa (1); 1 DeepMind, 2 University College London, UK |
| Pseudocode | Yes | Algorithm 1 ET(λ) |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We tested this idea on two canonical Atari games: Pong and Ms. Pac-Man. The results in Figure 6 show that the expected traces helped speed up learning compared to the baseline which uses accumulating traces, for various step sizes. ... Bellemare, M. G.; Naddaf, Y.; Veness, J.; and Bowling, M. 2013. The Arcade Learning Environment: An Evaluation Platform for General Agents. J. Artif. Intell. Res. (JAIR) 47: 253–279. |
| Dataset Splits | No | The paper does not provide specific details regarding dataset splits for training, validation, or testing used in its experiments. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like JAX and Haiku, but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We found that being able to track this rather quickly improved performance: the expected trace parameters Θ in the following experiment were updated with a relatively high step size of β = 0.1. ... All results are for λ = 0.95. Further implementation details and hyper-parameters are in the appendix. |
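The table references Algorithm 1 ET(λ) along with the step sizes β = 0.1 (for the expected-trace parameters Θ) and λ = 0.95. As a rough illustration of the core idea, the following is a hedged tabular sketch: a learned expected trace z(s) ≈ E[e | S = s] is trained toward the sampled accumulating trace e with step size β, and the value update then uses z(s) in place of e. All names (`et_lambda`, the episode tuple format) are our own illustrative choices, not the paper's implementation, which uses JAX/Haiku function approximation.

```python
import numpy as np

def et_lambda(episodes, n_states, alpha=0.1, beta=0.1, gamma=0.99, lam=0.95):
    """Tabular sketch of expected eligibility traces for prediction.

    episodes: iterable of episodes, each a list of (s, r, s_next, done).
    Returns the learned value estimates v.
    """
    v = np.zeros(n_states)              # value estimates
    z = np.zeros((n_states, n_states))  # expected trace per state (tabular Theta)
    for episode in episodes:
        e = np.zeros(n_states)          # sampled accumulating trace
        for s, r, s_next, done in episode:
            e = gamma * lam * e
            e[s] += 1.0                 # gradient of v(s) w.r.t. tabular weights
            # train the expected trace toward the sampled trace (step size beta)
            z[s] += beta * (e - z[s])
            td = r + (0.0 if done else gamma * v[s_next]) - v[s]
            v += alpha * td * z[s]      # update with the expected trace, not e
    return v

# usage on a toy two-state episode
episode = [(0, 0.0, 1, False), (1, 1.0, 1, True)]
values = et_lambda([episode], n_states=2)
```

The key difference from standard TD(λ) is the single substituted line: the value update multiplies the TD error by the learned z(s) rather than the instantaneous trace e, which is what lets information aggregate across episodes.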