Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Online Guidance Graph Optimization for Lifelong Multi-Agent Path Finding
Authors: Hongzhi Zang, Yulun Zhang, He Jiang, Zhe Chen, Daniel Harabor, Peter J. Stuckey, Jiaoyang Li
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments comparing online and offline guidance, optimized guidance policies versus human-designed guidance policies, and the advantages of online guidance with dynamic task distribution. We also compare the runtime of all algorithms. Then, we present results for guidance policies for GPIBT with LNS. |
| Researcher Affiliation | Academia | Hongzhi Zang(1)*, Yulun Zhang(2)*, He Jiang(2), Zhe Chen(3), Daniel Harabor(3), Peter J. Stuckey(3), Jiaoyang Li(2). (1) Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China; (2) Robotics Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA; (3) Department of Data Science and AI, Monash University, Melbourne, VIC 3800, Australia. |
| Pseudocode | No | The paper describes methods and processes using descriptive text and flowcharts (Figure 2, Figure 3), but it does not include any explicitly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | Code https://github.com/zanghz21/Online GGO |
| Open Datasets | Yes | We conduct experiments on 4 maps: (1) sortation-33-57, (2) warehouse-33-57, (3) empty-32-32, and (4) random-32-32, shown on top of Figure 4. The first two have regular patterns and are used to test MAPF algorithms in automated warehouse settings. Specifically, the sortation map is the same as in (Chen et al. 2024) and the warehouse map is generated by us. The latter two are selected from the MAPF benchmark (Stern et al. 2019). |
| Dataset Splits | Yes | We consider both static and dynamic task distributions. The static task distribution samples goals uniformly, the most common setting in previous works (Chen et al. 2024; Zhang et al. 2023b,a). For the dynamic task distribution, goals are sampled from a Gaussian or multimodal Gaussian distribution, where the Gaussian centers change every 200 timesteps. Concretely, in the sortation and warehouse maps, goals on endpoints are sampled from a Gaussian distribution, and goals on workstations are sampled uniformly. For the empty and random maps, goals are sampled from a multimodal Gaussian distribution with K Gaussian centers. The hyperparameters for the distributions are provided in Table 3 in Appendix A.1. |
| Hardware Specification | Yes | The CPU runtime for all algorithms is measured on a local machine with a 64-core AMD Ryzen Threadripper 3990X CPU, 192 GB of RAM, and an Nvidia RTX 3090Ti GPU. More compute resource information can be found in Appendix A.4. |
| Software Dependencies | No | The paper states "For relevant software libraries, see Appendix A.5."; however, the content of Appendix A.5 is not provided in the text, so specific version numbers for software dependencies cannot be verified. |
| Experiment Setup | Yes | We run all LMAPF algorithms for N = 1,000 timesteps. In on+PIBT and [p-on]+GPIBT, we update the guidance graph every m = 20 timesteps. LNS parameters are set as N_iter = 10, n_g = 10, and t_LNS = 8. CMA-ES hyperparameters are kept the same as in the main results. We include information on the selection of CMA-ES-related hyperparameters in Appendix A.1. |
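The reported setup (N = 1,000 simulation timesteps, a guidance-graph update every m = 20 timesteps, and Gaussian task-distribution centers that shift every 200 timesteps) can be sketched as a minimal simulation skeleton. This is an illustration only, not the authors' implementation: all function and constant names are hypothetical, the guidance update is a placeholder, and the single-Gaussian goal sampler with sigma = 3.0 merely stands in for the paper's (multimodal) distributions whose hyperparameters live in its Appendix A.1.

```python
import random

N_TIMESTEPS = 1000     # episode length N from the experiment setup
UPDATE_PERIOD = 20     # guidance-graph update interval m
CENTER_PERIOD = 200    # how often the dynamic distribution's center moves
MAP_W, MAP_H = 32, 32  # e.g. the empty-32-32 map

def sample_goal(center, sigma=3.0):
    """Sample a goal cell from a Gaussian around `center`, clipped to the map.

    `sigma` is an illustrative value, not taken from the paper.
    """
    x = min(max(int(random.gauss(center[0], sigma)), 0), MAP_W - 1)
    y = min(max(int(random.gauss(center[1], sigma)), 0), MAP_H - 1)
    return (x, y)

def run_episode(seed=0):
    """Run one LMAPF episode, counting guidance-graph updates."""
    random.seed(seed)
    center = (MAP_W // 2, MAP_H // 2)
    n_updates, goals = 0, []
    for t in range(N_TIMESTEPS):
        if t % CENTER_PERIOD == 0:
            # dynamic task distribution: resample the Gaussian center
            center = (random.randrange(MAP_W), random.randrange(MAP_H))
        if t % UPDATE_PERIOD == 0:
            # placeholder for the online guidance-graph optimization step
            n_updates += 1
        goals.append(sample_goal(center))
    return n_updates, goals

n_updates, goals = run_episode()
# with N = 1000 and m = 20, the guidance graph is updated 1000 / 20 = 50 times
```

The skeleton makes the update cadence concrete: the online methods re-optimize the guidance graph 50 times per episode, while the task distribution itself shifts only 5 times.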