Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-Objective POMDPs with Lexicographic Reward Preferences

Authors: Kyle Hollins Wray, Shlomo Zilberstein

IJCAI 2015

Reproducibility Variable Result LLM Response
Research Type Experimental We test the algorithms using real-world road data provided by Open Street Map (OSM) within 10 major cities. Finally, we present GPU-based optimizations for point-based solvers, demonstrating that their application enables us to quickly solve vastly larger LPOMDPs and other variations of POMDPs.
Researcher Affiliation Academia Kyle Hollins Wray and Shlomo Zilberstein College of Information and Computer Sciences University of Massachusetts, Amherst, MA 01003 EMAIL
Pseudocode No The paper describes algorithms using equations and narrative, but no explicit pseudocode or algorithm blocks are provided.
Open Source Code No Additionally, we plan to make our domain-specific tools public, in addition to both our CPU and GPU source code, to facilitate community-wide development of realistic-scale domains and algorithms for planning under partial observability.
Open Datasets Yes We test the algorithms using real-world road data provided by Open Street Map (OSM) within 10 major cities.
Dataset Splits No The paper does not provide explicit details about dataset splits for training, validation, or testing.
Hardware Specification Yes Experiments were conducted with an Intel(R) Core(TM) i7-4702HQ CPU at 2.20GHz, 8GB of RAM, and an Nvidia(R) GeForce GTX 870M graphics card using C++ and CUDA(C) 6.5.
Software Dependencies Yes Experiments were conducted with an Intel(R) Core(TM) i7-4702HQ CPU at 2.20GHz, 8GB of RAM, and an Nvidia(R) GeForce GTX 870M graphics card using C++ and CUDA(C) 6.5.
Experiment Setup Yes Table 1: Computation time (seconds), and the initial belief's values (negated travel time; seconds) over 10 cities for LPBVI on the CPU (h = 10) and GPU (h = 500) and the improvement ratio: (50 × CPU)/GPU adjusted for the horizon difference. These are point-based algorithms without expansion, so each horizon step is the same operation.
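The horizon adjustment in the Table 1 caption can be illustrated with a minimal sketch. Because each horizon step is described as the same operation, the CPU time at h = 10 is scaled by 500/10 = 50 before being divided by the GPU time at h = 500. The function name `horizon_adjusted_speedup` and the timings below are hypothetical and are not taken from the paper.

```python
def horizon_adjusted_speedup(cpu_seconds, gpu_seconds,
                             cpu_horizon=10, gpu_horizon=500):
    """Ratio of CPU to GPU time after scaling the CPU time to the GPU horizon.

    Assumes every horizon step costs the same, so time grows linearly
    with the horizon (the caption's stated reason for the adjustment).
    """
    if cpu_seconds < 0 or gpu_seconds <= 0:
        raise ValueError("timings must be non-negative, and GPU time positive")
    if cpu_horizon <= 0 or gpu_horizon <= 0:
        raise ValueError("horizons must be positive")
    scale = gpu_horizon / cpu_horizon  # 500 / 10 = 50 with the defaults
    return scale * cpu_seconds / gpu_seconds


# Hypothetical timings for illustration only (not values from Table 1).
ratio = horizon_adjusted_speedup(cpu_seconds=4.0, gpu_seconds=10.0)
print(ratio)  # (50 * 4.0) / 10.0 = 20.0
```

With equal horizons the scale factor is 1 and the function reduces to the plain CPU/GPU time ratio.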