Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Approximate Value Equivalence

Authors: Christopher Grimm, André Barreto, Satinder Singh

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In contrast to previous works, we show empirically that there are situations where agents with limited capacity should prefer to learn more accurate models with respect to smaller sets of functions over less accurate models with respect to larger sets of functions. ... To illustrate situations where this might occur, we consider the tabular Four Rooms domain [Sutton et al., 1999] and learn tabular VE models whose per-action transition matrices are constrained to have rank at most R. ... Figure 2 shows a histogram of the planning performance of such models. Each cell in the figure corresponds to a model associated with specific values of R and D, and the cell's color denotes the value of the model's optimal policy averaged over states and over 10 independent executions.
Researcher Affiliation	Collaboration	Christopher Grimm, Computer Science & Engineering, University of Michigan, EMAIL; André Barreto, DeepMind, EMAIL; Satinder Singh, DeepMind, EMAIL
Pseudocode	No	The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code	No	The paper states in section 3.a of the ethics checklist: 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'. There is no other explicit statement about open-sourcing code for the methodology.
Open Datasets	Yes	To illustrate situations where this might occur, we consider the tabular Four Rooms domain [Sutton et al., 1999]
Dataset Splits	No	The paper does not explicitly state training, validation, or test dataset splits. The ethics checklist 3.b indicates N/A for training details including data splits.
Hardware Specification	No	The paper does not explicitly describe the hardware used to run its experiments. The ethics checklist 3.d indicates N/A for compute resources.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies. The ethics checklist 3.d indicates N/A for compute resources, which often includes software details.
Experiment Setup	Yes	To illustrate situations where this might occur, we consider the tabular Four Rooms domain [Sutton et al., 1999] and learn tabular VE models whose per-action transition matrices are constrained to have rank at most R. We learn these models to be in the VE class M(Π, V), where V is a set of D functions generated by sampling v(s) ∼ Uniform(-10, 10) for each v ∈ V and each s ∈ S. ... Each cell in the figure corresponds to a model associated with specific values of R and D, and the cell's color denotes the value of the model's optimal policy averaged over states and over 10 independent executions.
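
The two ingredients named in the Experiment Setup row (a set V of D randomly sampled functions and per-action transition matrices of rank at most R) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function names, the seed handling, and the choice to build the rank constraint as a product of two row-stochastic factors are all assumptions made here, and the number of states is left as a parameter rather than tied to the Four Rooms layout.

```python
import random


def sample_value_functions(num_states, num_functions, low=-10.0, high=10.0, seed=0):
    """Sample D tabular functions with v(s) ~ Uniform(low, high), independently per state."""
    rng = random.Random(seed)
    return [
        [rng.uniform(low, high) for _ in range(num_states)]
        for _ in range(num_functions)
    ]


def random_stochastic_rows(num_rows, num_cols, rng):
    """Return a num_rows x num_cols matrix whose rows are probability vectors."""
    rows = []
    for _ in range(num_rows):
        # Small offset keeps the row total strictly positive.
        weights = [rng.random() + 1e-12 for _ in range(num_cols)]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


def low_rank_transition_matrix(num_states, rank, seed=0):
    """Build a row-stochastic S x S matrix of rank at most `rank`.

    Assumption (not from the paper): the constraint is imposed by writing the
    matrix as A @ B, with A of shape S x R and B of shape R x S, both
    row-stochastic. Their product is row-stochastic and has rank <= R.
    """
    rng = random.Random(seed)
    a = random_stochastic_rows(num_states, rank, rng)  # S x R
    b = random_stochastic_rows(rank, num_states, rng)  # R x S
    return [
        [sum(a[i][k] * b[k][j] for k in range(rank)) for j in range(num_states)]
        for i in range(num_states)
    ]


# Hypothetical sizes chosen only for demonstration.
values = sample_value_functions(num_states=5, num_functions=3)
transition = low_rank_transition_matrix(num_states=5, rank=2)
```

Here `values` plays the role of V with D = 3, and `transition` stands in for one action's transition matrix with R = 2. Fitting such a model to be value equivalent with respect to V is a separate step that this sketch does not attempt.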