Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Modeling and Planning with Macro-Actions in Decentralized POMDPs

Authors: Christopher Amato, George Konidaris, Leslie P. Kaelbling, Jonathan P. How

JAIR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We test the performance of our macro-action-based algorithms in simulation, using existing benchmarks, a larger domain, and in a novel multi-robot warehousing domain. For the simulation experiments, we test on a common Dec-POMDP benchmark, a four agent extension of this benchmark, and a large problem inspired by robot navigation. Our algorithms were run on a single core 2.5 GHz machine with 8GB of memory. For option-based MBDP (O-MBDP), heuristic policies for the desired lengths were generated by producing 1000 random policies and keeping the joint policy with the highest value at the initial state. Sampling was used (10000 simulations) to determine if a policy will terminate before the horizon of interest.
Researcher Affiliation	Academia	Christopher Amato EMAIL Khoury College of Computer Sciences, Northeastern University Boston, MA 02115 USA; George Konidaris EMAIL Department of Computer Science, Brown University Providence, RI 02912 USA; Leslie P. Kaelbling EMAIL MIT Computer Science and Artificial Intelligence Laboratory Cambridge, MA 02139 USA; Jonathan P. How EMAIL MIT Laboratory for Information and Decision Systems Cambridge, MA 02139 USA
Pseudocode	Yes	Algorithm 1 Option-based dynamic programming (O-DP); Algorithm 2 Option-based memory bounded dynamic programming (O-MBDP); Algorithm 3 Option-based direct cross entropy policy search (O-DICE)
Open Source Code	No	The paper does not provide an explicit statement about releasing their source code, nor does it include a link to a code repository. It mentions SMACH (Bohren, 2010) and ROS (Quigley, Conley, Gerkey, Faust, Foote, Leibs, Wheeler, & Ng, 2009) as tools, and provides a link to a video demonstrating results, but not to their implementation code.
Open Datasets	No	The paper refers to standard benchmarks such as the "meeting-in-a-grid problem" and "two-agent version of the problem of robots navigating among movable obstacles (Stilman & Kuffner, 2005)", and also uses a custom-built "multi-robot warehousing domain". However, it does not provide specific links, DOIs, repository names, or formal citations with author names and year for accessing these problem instances or any datasets used.
Dataset Splits	No	The paper describes problem setups and benchmarks, but it does not specify any training, testing, or validation dataset splits. For example, it mentions randomly initialized starting locations for agents in the grid problem but doesn't detail any data partitioning methodology.
Hardware Specification	Yes	Our algorithms were run on a single core 2.5 GHz machine with 8GB of memory.
Software Dependencies	No	The paper mentions SMACH (Bohren, 2010) and ROS (Quigley, Conley, Gerkey, Faust, Foote, Leibs, Wheeler, & Ng, 2009) as tools used. However, it does not provide specific version numbers for any libraries, programming languages, or solvers directly used in the implementation of their algorithms.
Experiment Setup	Yes	For option-based MBDP (O-MBDP), heuristic policies for the desired lengths were generated by producing 1000 random policies and keeping the joint policy with the highest value at the initial state. Sampling was used (10000 simulations) to determine if a policy will terminate before the horizon of interest. For O-MBDP, MaxTrees = 3, which was chosen to balance solution quality and running time. For O-DICE, Iter = 100, N = 10, Nb = 5, and α = 0.1, which were chosen based on suggestions from the original work (Oliehoek et al., 2008).
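
The settings quoted in the Experiment Setup row can be read as inputs to two generic search routines. The Python sketch below is an illustration only, not the authors' implementation, which this entry reports as unreleased. It assumes that Iter is the number of iterations, N the number of samples per iteration, Nb the number of best samples kept, and α the learning rate of a cross-entropy update; that reading follows common cross-entropy-method conventions and is not confirmed by the table. All function names, the toy value function, and the flat tuple policy representation are hypothetical.

```python
import random

# Settings quoted in the Experiment Setup row. The key names are hypothetical.
SETTINGS = {
    "heuristic_random_policies": 1000,
    "termination_simulations": 10000,
    "o_mbdp_max_trees": 3,
    "o_dice": {"iterations": 100, "samples": 10, "best_kept": 5, "alpha": 0.1},
}


def best_random_policy(sample_policy, evaluate, n_policies, rng):
    """Draw n_policies random policies and keep the one with the highest value."""
    best_policy, best_value = None, float("-inf")
    for _ in range(n_policies):
        policy = sample_policy(rng)
        value = evaluate(policy)
        if value > best_value:
            best_policy, best_value = policy, value
    return best_policy, best_value


def cross_entropy_search(n_steps, n_options, evaluate,
                         iterations, samples, best_kept, alpha, rng):
    """Generic cross-entropy search over fixed-length sequences of option indices.

    Assumption: a policy is a flat tuple of option indices, far simpler than
    the policy trees used for Dec-POMDPs.
    """
    # One categorical distribution per step, initially uniform.
    probs = [[1.0 / n_options] * n_options for _ in range(n_steps)]
    best_policy, best_value = None, float("-inf")
    for _ in range(iterations):
        population = []
        for _ in range(samples):
            policy = tuple(
                rng.choices(range(n_options), weights=probs[t])[0]
                for t in range(n_steps)
            )
            population.append((evaluate(policy), policy))
        # Rank by value and keep the best_kept highest-valued samples.
        population.sort(key=lambda pair: pair[0], reverse=True)
        elite = [policy for _, policy in population[:best_kept]]
        if population[0][0] > best_value:
            best_value, best_policy = population[0]
        # Move each distribution toward the elite frequencies at rate alpha.
        for t in range(n_steps):
            for o in range(n_options):
                freq = sum(1 for p in elite if p[t] == o) / len(elite)
                probs[t][o] = (1 - alpha) * probs[t][o] + alpha * freq
    return best_policy, best_value


if __name__ == "__main__":
    # Toy value function (hypothetical): number of positions matching a target.
    target = (1, 0, 2, 1)

    def toy_value(policy):
        return sum(1 for a, b in zip(policy, target) if a == b)

    def sample(rng):
        return tuple(rng.randrange(3) for _ in range(len(target)))

    rng = random.Random(0)
    print(best_random_policy(sample, toy_value,
                             SETTINGS["heuristic_random_policies"], rng))
    print(cross_entropy_search(len(target), 3, toy_value,
                               rng=rng, **SETTINGS["o_dice"]))
```

With α = 0.1 the sampling distribution shifts only slightly per iteration, so a small sample size such as N = 10 does not dominate the search after a single lucky draw.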