Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Expected Eligibility Traces
Authors: Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel, David Silver, André Barreto, Diana Borsa
Pages: 9997-10005
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical Analysis: From the insights above, we expect that ET(λ) yields lower prediction errors because it has lower variance and aggregates information across episodes better. In this section we empirically investigate expected traces in several experiments. |
| Researcher Affiliation | Collaboration | Hado van Hasselt¹, Sephora Madjiheurem², Matteo Hessel¹, David Silver¹, André Barreto¹, Diana Borsa¹; ¹DeepMind, ²University College London, UK |
| Pseudocode | Yes | Algorithm 1 ET(λ) |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We tested this idea on two canonical Atari games: Pong and Ms. Pac-Man. The results in Figure 6 show that the expected traces helped speed up learning compared to the baseline which uses accumulating traces, for various step sizes. ... Bellemare, M. G.; Naddaf, Y.; Veness, J.; and Bowling, M. 2013. The Arcade Learning Environment: An Evaluation Platform for General Agents. J. Artif. Intell. Res. (JAIR) 47: 253-279. |
| Dataset Splits | No | The paper does not provide specific details regarding dataset splits for training, validation, or testing used in its experiments. |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software like JAX and Haiku, but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We found that being able to track this rather quickly improved performance: the expected trace parameters Θ in the following experiment were updated with a relatively high step size of β = 0.1. ... All results are for λ = 0.95. Further implementation details and hyper-parameters are in the appendix. |