Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Point-Based Value Iteration for Finite-Horizon POMDPs
Authors: Erwin Walraven, Matthijs T. J. Spaan
JAIR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments we demonstrate that the algorithm is an effective method for solving finite-horizon POMDPs. |
| Researcher Affiliation | Academia | Erwin Walraven, Matthijs T. J. Spaan; Delft University of Technology, Van Mourik Broekmanweg 6, 2628 XE Delft, The Netherlands |
| Pseudocode | Yes | Algorithm 1: Sawtooth approximation (UB), Algorithm 2: Finite-horizon point-based Value Iteration (FiVI), Algorithm 3: Belief expansion (expand), Algorithm 4: Perseus Belief Selection (PBS) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We use multiple domains from pomdp.org, which we solve with horizons h = 5, 10, 15, 20. |
| Dataset Splits | No | The paper uses POMDP domains to test the algorithms, which are problem definitions rather than datasets requiring explicit training/test/validation splits. No specific dataset split information is provided for reproducibility in terms of data partitioning for learning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or detailed computer specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., programming languages, libraries, or external solvers) used to implement and run the described algorithms. |
| Experiment Setup | Yes | We let the algorithms run for at most 15 minutes, after which execution is terminated. Furthermore, we stop algorithm execution if the gap between the lower bound and upper bound drops below 0.01. For DBBU we consider the parameters θ = 10, 20, 30, 40. We let the algorithm sample beliefs during 1000 episodes. During our experiments we use discretization parameter D = 10. The third method we consider is the infinite-horizon algorithm GapMin which computes an infinite-horizon policy with γ = 0.99. |
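
The stopping rule quoted in the Experiment Setup row (a 15-minute time limit, plus early termination once the gap between the lower and upper bound drops below 0.01) can be sketched as follows. This is a minimal illustration, not the authors' implementation, since the paper releases no code. The names `run_until_converged`, `improve_bounds`, and `clock` are hypothetical, and `improve_bounds` stands in for one iteration of whichever solver is being evaluated.

```python
import time
from typing import Callable, Tuple


def run_until_converged(
    improve_bounds: Callable[[], Tuple[float, float]],
    time_limit_s: float = 15 * 60,   # 15-minute limit quoted in the table
    gap_tolerance: float = 0.01,     # bound-gap threshold quoted in the table
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[float, float, str]:
    """Repeat solver iterations until the bound gap or the time limit ends the run.

    improve_bounds is a hypothetical callback that performs one solver
    iteration and returns the current (lower_bound, upper_bound) pair.
    """
    start = clock()
    lower, upper = improve_bounds()
    while True:
        # Stop early once the bounds are within the tolerance.
        if upper - lower < gap_tolerance:
            return lower, upper, "gap"
        # Otherwise stop when the time budget is exhausted.
        if clock() - start >= time_limit_s:
            return lower, upper, "timeout"
        lower, upper = improve_bounds()
```

The time limit is checked only between iterations, so a single long iteration can overrun the budget. A harness that must enforce the limit strictly would need to interrupt the solver from outside.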