Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Multi-Environment POMDPs: Discrete Model Uncertainty Under Partial Observability
Authors: Eline M. Bovy, Caleb Probine, Marnix Suilen, Ufuk Topcu, Nils Jansen
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that we can compute policies for standard POMDP benchmarks extended to the multi-environment setting. ...We study ME-POMDPs and devise algorithms to compute robust policies against any adversarial choice of POMDP in the ME-POMDP. ...6 Experimental Evaluation The implementation of the LPs (3) and (4) along with AB-HSVI (Algorithm 1) forms a solution method for ME-POMDPs, and we answer the following research questions regarding this method. (Q1) Scalability: What is the computational cost of solving AB-POMDPs? (Q2) Baseline comparison: What is the added difficulty of robustness against adversarial beliefs compared to a naive baseline of solving individual POMDPs? (Q3) Model formulation: Does the model type, i.e., whether the problem is formulated as a ME-POMDP, PO-MEMDP, MO-POMDP or AB-POMDP, influence the performance? As no benchmarks exist for ME-POMDPs, we introduce two benchmarks for our experimental evaluation. ...Tables 1 and 2 show the results of running AB-HSVI on the Bird problem and Rock Sample. |
| Researcher Affiliation | Collaboration | Eline M. Bovy Radboud University Nijmegen, The Netherlands EMAIL Caleb Probine The University of Texas at Austin Austin, TX, USA EMAIL Marnix Suilen University of Antwerp Flanders Make Antwerp, Belgium EMAIL Ufuk Topcu The University of Texas at Austin Austin, TX, USA EMAIL Nils Jansen Ruhr-University Bochum & Radboud University Bochum, Germany & Nijmegen, The Netherlands EMAIL |
| Pseudocode | Yes | Algorithm 1 AB-HSVI Algorithm 2 Extracting policies from α-vectors. Algorithm 3 Extracting policies from α-vectors with pruning. |
| Open Source Code | Yes | All code is available at [6]. [6] Eline M. Bovy, Caleb Probine, Marnix Suilen, Ufuk Topcu, and Nils Jansen. Code for the AB-HSVI algorithm and the experiments in the paper: "Multi-environment POMDPs: Discrete model uncertainty under partial observability" (NeurIPS 2025), 2025. URL https://doi.org/10.5281/zenodo.17425571. |
| Open Datasets | Yes | As no benchmarks exist for ME-POMDPs, we introduce two benchmarks for our experimental evaluation. The first benchmark is based on the endangered bird preservation case study presented in Appendix B, which we shall refer to as the Bird problem. ...For the second benchmark, we extend Rock Sample [40] to ME-POMDPs. ...All code is available at [6]. [6] Eline M. Bovy, Caleb Probine, Marnix Suilen, Ufuk Topcu, and Nils Jansen. Code for the AB-HSVI algorithm and the experiments in the paper: "Multi-environment POMDPs: Discrete model uncertainty under partial observability" (NeurIPS 2025), 2025. URL https://doi.org/10.5281/zenodo.17425571. |
| Dataset Splits | No | The paper introduces new benchmarks ( |
| Hardware Specification | Yes | We run experiments on a computer with an Intel Core i9-10980XE 3.00GHz processor and 256GB of RAM. |
| Software Dependencies | Yes | We use Gurobi [18] to solve LPs. [18] Gurobi Optimization, LLC. Gurobi Optimizer Reference Manual, 2024. URL https://www.gurobi.com. |
| Experiment Setup | Yes | We set a time limit tl of 3600 seconds, discount factor γ = 0.95, and set HSVI's gap threshold to ϵ = 0.1 Rmin where Rmin is the minimum problem reward. |
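
The settings quoted in the Experiment Setup row can be collected in a small configuration object. The following is a minimal sketch, not the authors' code: the class name `ExperimentSetup`, its field names, and the `gap_threshold` helper are hypothetical, and the threshold is computed literally as 0.1 times the minimum reward, since the quoted text does not say how the sign of Rmin is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hypothetical container for the settings quoted in the table above."""

    time_limit_s: float = 3600.0  # time limit tl, in seconds
    discount: float = 0.95        # discount factor gamma
    gap_scale: float = 0.1        # multiplier applied to the minimum reward

    def gap_threshold(self, r_min: float) -> float:
        # Literal reading of "epsilon = 0.1 Rmin"; sign handling for
        # negative rewards is not specified in the quoted text.
        return self.gap_scale * r_min


if __name__ == "__main__":
    setup = ExperimentSetup()
    # r_min = 10.0 is an illustrative value, not taken from the paper.
    print(setup.time_limit_s, setup.discount, setup.gap_threshold(10.0))
```

Freezing the dataclass keeps the settings immutable, so a run cannot silently change them after they have been logged.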