Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Reference-Based POMDPs

Authors: Edward Kim, Yohan Karunanayake, Hanna Kurniawati

NeurIPS 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on long-horizon 2D and 3D navigation problems are presented in Section 6 and indicate that our solver can be employed to substantially outperform POMCP [21].
Researcher Affiliation	Academia	Edward Kim, School of Computing, Australian National University, Canberra, Australia; Yohan Karunanayake, School of Computing, Australian National University, Canberra, Australia; Hanna Kurniawati, School of Computing, Australian National University, Canberra, Australia
Pseudocode	Yes	Pseudo code for an implemented algorithm where the reference policy is constructed using a fully-observed policy πFO is presented in the Appendix Section A.6 Algorithm 1.
Open Source Code	Yes	We include the source code for REFSOLVER, which is developed on top of pomdp_py as a Supplementary Material. https://github.com/RDLLab/ref_pomdp_neurips23
Open Datasets	No	The paper uses simulated environments and generated maps for its experiments (e.g., 'The environment is a 60 × 60 static gridworld populated...', 'we generated 64 different initial maps with randomly placed obstacles and landmarks'). It does not provide access information or citations to any publicly available or open datasets.
Dataset Splits	No	The paper conducts experiments in simulated environments and does not specify training, validation, or test dataset splits. Performance is evaluated directly by running the proposed solver and baselines in these environments.
Hardware Specification	Yes	All experiments were performed on a desktop computer with an 8 Core Intel Xeon Silver 4110 Processor and 128GB DDR4 RAM.
Software Dependencies	No	The paper mentions using the 'pomdp_py' library, but it does not specify a version number for this or any other software dependency required for replication.
Experiment Setup	Yes	POMCP was run with an exploration constant of 300 and a maximum depth of 180; this maximum depth is the upper bound for (tree depth + rollout steps). For REFSOLVER, the maximum tree depth was 90, the maximum rollout depth was 180 and α = 0.5.
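
For anyone recording the reported setup alongside a replication attempt, the values in the Experiment Setup row could be captured in a small configuration sketch like the one below. The class, field, and function names are hypothetical and are not taken from the paper's code or from pomdp_py; only the numeric values come from the row above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PomcpSettings:
    """Reported POMCP settings (field names are illustrative assumptions)."""
    exploration_constant: float = 300.0
    max_depth: int = 180  # upper bound on (tree depth + rollout steps)


@dataclass(frozen=True)
class RefSolverSettings:
    """Reported REFSOLVER settings (field names are illustrative assumptions)."""
    max_tree_depth: int = 90
    max_rollout_depth: int = 180
    alpha: float = 0.5


def within_pomcp_depth(tree_depth: int, rollout_steps: int,
                       settings: PomcpSettings) -> bool:
    """Check that tree depth plus rollout steps stays within the POMCP bound."""
    return tree_depth + rollout_steps <= settings.max_depth


if __name__ == "__main__":
    pomcp = PomcpSettings()
    ref = RefSolverSettings()
    print(pomcp)
    print(ref)
    print(within_pomcp_depth(90, 90, pomcp))
```

Frozen dataclasses are used here so that the recorded values cannot be changed by accident during a run; any other plain configuration format would serve equally well.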