Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Modeling and Planning with Macro-Actions in Decentralized POMDPs
Authors: Christopher Amato, George Konidaris, Leslie P. Kaelbling, Jonathan P. How
JAIR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test the performance of our macro-action-based algorithms in simulation, using existing benchmarks, a larger domain, and in a novel multi-robot warehousing domain. For the simulation experiments, we test on a common Dec-POMDP benchmark, a four agent extension of this benchmark, and a large problem inspired by robot navigation. Our algorithms were run on a single core 2.5 GHz machine with 8GB of memory. For option-based MBDP (O-MBDP), heuristic policies for the desired lengths were generated by producing 1000 random policies and keeping the joint policy with the highest value at the initial state. Sampling was used (10000 simulations) to determine if a policy will terminate before the horizon of interest. |
| Researcher Affiliation | Academia | Christopher Amato EMAIL Khoury College of Computer Sciences, Northeastern University Boston, MA 02115 USA; George Konidaris EMAIL Department of Computer Science, Brown University Providence, RI 02912 USA; Leslie P. Kaelbling EMAIL MIT Computer Science and Artificial Intelligence Laboratory Cambridge, MA 02139 USA; Jonathan P. How EMAIL MIT Laboratory for Information and Decision Systems Cambridge, MA 02139 USA |
| Pseudocode | Yes | Algorithm 1 Option-based dynamic programming (O-DP); Algorithm 2 Option-based memory bounded dynamic programming (O-MBDP); Algorithm 3 Option-based direct cross entropy policy search (O-DICE) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing their source code, nor does it include a link to a code repository. It mentions SMACH (Bohren, 2010) and ROS (Quigley, Conley, Gerkey, Faust, Foote, Leibs, Wheeler, & Ng, 2009) as tools, and provides a link to a video demonstrating results, but not to their implementation code. |
| Open Datasets | No | The paper refers to standard benchmarks such as the "meeting-in-a-grid problem" and a "two-agent version of the problem of robots navigating among movable obstacles (Stilman & Kuffner, 2005)", and also uses a custom-built "multi-robot warehousing domain". However, it does not provide links, DOIs, or repository names for accessing these problem instances or any datasets used. |
| Dataset Splits | No | The paper describes problem setups and benchmarks, but it does not specify any training, testing, or validation dataset splits. For example, it mentions randomly initialized starting locations for agents in the grid problem but doesn't detail any data partitioning methodology. |
| Hardware Specification | Yes | Our algorithms were run on a single core 2.5 GHz machine with 8GB of memory. |
| Software Dependencies | No | The paper mentions SMACH (Bohren, 2010) and ROS (Quigley, Conley, Gerkey, Faust, Foote, Leibs, Wheeler, & Ng, 2009) as tools used. However, it does not provide specific version numbers for any libraries, programming languages, or solvers directly used in the implementation of their algorithms. |
| Experiment Setup | Yes | For option-based MBDP (O-MBDP), heuristic policies for the desired lengths were generated by producing 1000 random policies and keeping the joint policy with the highest value at the initial state. Sampling was used (10000 simulations) to determine if a policy will terminate before the horizon of interest. For O-MBDP, MaxTrees = 3, which was chosen to balance solution quality and running time. For O-DICE, Iter = 100, N = 10, Nb = 5, and α = 0.1, which were chosen based on suggestions from the original work (Oliehoek et al., 2008). |
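
The two sampling steps quoted in the Experiment Setup row (keeping the best of 1000 random joint policies, and using 10000 simulations to check termination before the horizon) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy policy representation, and the toy value function are all hypothetical stand-ins, and counting a run that ends exactly at the horizon as terminated is an assumption.

```python
import random


def best_random_policy(sample_policy, evaluate, num_candidates=1000, seed=0):
    """Draw random joint policies and keep the one with the highest value.

    `sample_policy(rng)` and `evaluate(policy)` are hypothetical callables
    supplied by the caller; `evaluate` stands in for the policy's value at
    the initial state.
    """
    rng = random.Random(seed)
    best_policy, best_value = None, float("-inf")
    for _ in range(num_candidates):
        policy = sample_policy(rng)
        value = evaluate(policy)
        if value > best_value:
            best_policy, best_value = policy, value
    return best_policy, best_value


def estimate_termination_probability(sample_duration, horizon,
                                     num_simulations=10000, seed=0):
    """Monte Carlo estimate of the chance a policy finishes within `horizon`.

    `sample_duration(rng)` is a hypothetical simulator returning the number
    of steps one rollout takes. Assumption: a rollout that ends exactly at
    the horizon counts as terminated.
    """
    rng = random.Random(seed)
    finished = sum(
        1 for _ in range(num_simulations) if sample_duration(rng) <= horizon
    )
    return finished / num_simulations


# Toy stand-ins (assumptions, not from the paper): a joint policy is a pair
# of option indices, and its "value" is simply their sum.
def toy_sample(rng):
    return (rng.randrange(4), rng.randrange(4))


def toy_value(policy):
    return policy[0] + policy[1]


policy, value = best_random_policy(toy_sample, toy_value)
print(policy, value)
```

The defaults mirror the counts reported in the table (1000 candidates, 10000 simulations); in a real Dec-POMDP setting, `evaluate` would itself typically be a simulation-based or model-based value estimate rather than a closed-form function.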