Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

A new Q(lambda) with interim forward view and Monte Carlo equivalence

Authors: Rich Sutton, Ashique Rupam Mahmood, Doina Precup, Hado van Hasselt

ICML 2014 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	The main contributions of this paper are 1) the two new off-policy algorithms PTD(λ) and PQ(λ); 2) a new forward view of these algorithms that ensures Monte Carlo equivalence at λ = 1; 3) the notion of an interim forward view and a technique for using it to derive and prove equivalence of backward-view algorithms; and 4) applications of the technique to derive and prove equivalences for PTD(λ) and PQ(λ).
Researcher Affiliation	Academia	Richard S. Sutton, A. Rupam Mahmood EMAIL Reinforcement Learning and Artificial Intelligence Laboratory, University of Alberta, Edmonton, AB T6G 2E8 Canada Doina Precup EMAIL School of Computer Science, McGill University, Montréal, QC H3A 0G4 Canada Hado van Hasselt EMAIL Reinforcement Learning and Artificial Intelligence Laboratory, University of Alberta, Edmonton, AB T6G 2E8 Canada
Pseudocode	No	The paper describes algorithms using mathematical equations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any statements or links regarding the availability of open-source code for the described methodology.
Open Datasets	No	The paper is theoretical and does not report on experiments or use datasets, so no information about publicly available datasets is provided.
Dataset Splits	No	The paper is theoretical and does not describe experiments or dataset splits.
Hardware Specification	No	The paper is theoretical and does not describe any specific hardware used for experiments.
Software Dependencies	No	The paper is theoretical and does not describe any specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not include details about an experimental setup or hyperparameters.