Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Doubly-Asynchronous Value Iteration: Making Value Iteration Asynchronous in Actions

Authors: Tian Tian, Kenny Young, Richard S. Sutton

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We also empirically demonstrate DAVI's effectiveness in several experiments. [...] 6 Experiments [...] Figure 1 and Figure 2 show the performance of the algorithms.
Researcher Affiliation	Academia	Tian Tian Kenny Young Richard S. Sutton University of Alberta and Alberta Machine Intelligence Institute Edmonton, Alberta, Canada EMAIL
Pseudocode	Yes	Algorithm 1: DAVI(m, p, q, τ) Input: State sampling distribution p (S) Input: A potentially state conditional distribution over the sets of actions of size m denoted by q Input: Number of iterations τ, see Corollary 1 for how to choose τ to obtain an ϵ-optimal policy with high probability
Open Source Code	No	The checklist question 3(a) explicitly states: 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No]'
Open Datasets	No	The paper defines the structure of the MDPs used for experiments (e.g., 'single-state MDP with 10000 actions', 'tree with a depth of 2', 'random MDP with 100 states'), but it does not provide access information (URL, DOI, repository, or citation) for publicly available datasets.
Dataset Splits	No	The paper describes its experimental setup including different MDP structures and running each instance 200 times, but it does not specify any training, validation, or test dataset splits.
Hardware Specification	No	The checklist question 3(d) explicitly states: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No]'
Software Dependencies	No	The paper describes the algorithms implemented ('VI, Asynchronous VI, and DAVI') and their sampling methods, but it does not specify any particular software, libraries, or their version numbers used in the implementation.
Experiment Setup	Yes	DAVI with m = 1 was significantly different from that of DAVI with m = 10, 100, 1000, and DAVI with m = 10, 100, 1000 converged at a similar rate. [...] This experiment consists of a single-state MDP with 10000 actions, all terminate immediately. [...] The first set consists of a tree with a depth of 2. Each state has 50 actions, where each action leads to 2 other distinct next states. [...] The second set consists of a random MDP with 100 states, where each state has 1000 actions. [...] The γ in all of the MDPs are 1.
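
The Pseudocode and Experiment Setup rows quote only the inputs of Algorithm 1 (m, p, q, τ) and the shape of the test MDPs. As a rough illustration of what a loop with those inputs might look like, here is a minimal Python sketch run on the single-state, 10000-action MDP described above. Several things are assumptions and do not come from the table: the update rule (a backup maximized over the m sampled actions plus the best action found so far), uniform choices for p and q, the uniform random rewards, the iteration count of 300, and the helper names `davi` and `single_state_mdp`. It is not the authors' implementation, which the checklist says was not released.

```python
import random


def davi(transitions, rewards, gamma, m, num_iters, seed=0):
    """Sketch of a DAVI-style loop on a tabular MDP (illustrative only).

    transitions[s][a] is a list of (probability, next_state) pairs; an empty
    list means the action terminates immediately.
    rewards[s][a] is the expected immediate reward.
    """
    rng = random.Random(seed)
    n_states = len(transitions)
    values = [0.0] * n_states
    policy = [0] * n_states  # best action found so far in each state

    for _ in range(num_iters):
        # Assumption: p is uniform over states.
        s = rng.randrange(n_states)
        n_actions = len(transitions[s])
        # Assumption: q draws m distinct actions uniformly at random.
        candidates = set(rng.sample(range(n_actions), min(m, n_actions)))
        # Keep the incumbent action in the comparison.
        candidates.add(policy[s])

        best_action, best_value = policy[s], float("-inf")
        for a in sorted(candidates):
            # One-step backup for action a under the current value estimates.
            backup = rewards[s][a] + gamma * sum(
                prob * values[s_next] for prob, s_next in transitions[s][a]
            )
            if backup > best_value:
                best_action, best_value = a, backup

        values[s], policy[s] = best_value, best_action

    return values, policy


def single_state_mdp(n_actions=10_000, seed=0):
    """One state; every action terminates immediately (rewards are made up)."""
    rng = random.Random(seed)
    transitions = [[[] for _ in range(n_actions)]]
    rewards = [[rng.random() for _ in range(n_actions)]]
    return transitions, rewards


transitions, rewards = single_state_mdp()
for m in (1, 10, 100, 1000):
    values, policy = davi(transitions, rewards, gamma=1.0, m=m, num_iters=300)
    gap = max(rewards[0]) - values[0]
    print(f"m={m:>4}: value={values[0]:.4f}, gap to optimum={gap:.4f}")
```

Because every action terminates, each backup reduces to the immediate reward, so the printed gap only reflects how quickly the sampled subsets happen to cover a high-reward action. The numbers depend on the made-up rewards and seeds and are not comparable to the paper's figures.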