Belief-State Query Policies for User-Aligned POMDPs

Authors: Daniel Bramblett, Siddharth Srivastava

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluation on a diverse set of problems showing both the efficiency of our algorithm and the quality of the computed user-aligned policies (Sec. 7).
Researcher Affiliation | Academia | Daniel Bramblett and Siddharth Srivastava, Autonomous Agents and Intelligent Robots Lab, School of Computing and Augmented Intelligence, Arizona State University, AZ, USA, {drbrambl,siddharths}@asu.edu
Pseudocode | Yes | Algorithm 1: Partition Refinement Search (PRS)
Open Source Code | Yes | Complete source code is available in the supplementary material.
Open Datasets | No | The paper defines problems such as 'Lane merger', 'Spaceship repair', 'Graph rock sample', and 'Store visit', which appear to be simulation environments rather than public datasets with specified access information.
Dataset Splits | No | The paper discusses evaluating policies but does not specify training, validation, or test dataset splits.
Hardware Specification | Yes | All experiments were performed on an Intel(R) Xeon(R) W-2102 CPU @ 2.90GHz without using a GPU.
Software Dependencies | No | The paper describes the implementation as using a manager-worker design pattern but does not specify versions of software dependencies such as programming languages or libraries.
Experiment Setup | Yes | The manager maintained the hypothesized optimal partition and the current exploration rate. Table 3 shows the timeout and sample rate used for each problem for PRS... For Nelder-Mead optimization, we used a simplex with one more vertex than the number of parameters... For Particle Swarm optimization, 10 particles were used, with the location and momentum of each particle clipped to the search space; the coefficients changed based on the number of steps since the last improvement. (A hedged sketch of these optimizer settings follows the table.)
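For concreteness, the sketch below illustrates the two optimizer settings quoted in the Experiment Setup row: a Nelder-Mead starting simplex with one more vertex than the number of parameters, and clipping of the 10 Particle Swarm particles' locations and momenta to the search space. This is a minimal sketch assuming a NumPy-based implementation; the function names, the perturbation step, and the reading of "momentum clipped to the search space" are illustrative assumptions, not the authors' code.

```python
# Minimal sketch (not the authors' implementation) of the quoted optimizer settings.
import numpy as np


def nelder_mead_simplex(x0: np.ndarray, step: float = 0.1) -> np.ndarray:
    """Build a starting simplex with one more vertex than the number of
    parameters, as described for the Nelder-Mead runs. The perturbation
    step is an assumed value."""
    n = x0.size
    simplex = np.tile(x0.astype(float), (n + 1, 1))
    for i in range(n):
        simplex[i + 1, i] += step  # perturb one coordinate per extra vertex
    return simplex


def clip_swarm(positions: np.ndarray, velocities: np.ndarray,
               lower: np.ndarray, upper: np.ndarray):
    """Keep particle locations inside the search space and bound momenta by
    the width of the search space; the velocity bound is one reading of
    'momentum of each particle clipped to the search space'."""
    span = upper - lower
    positions = np.clip(positions, lower, upper)
    velocities = np.clip(velocities, -span, span)
    return positions, velocities


# Example: a swarm of 10 particles over an assumed 4-parameter space in [0, 1]^4.
rng = np.random.default_rng(0)
lower, upper = np.zeros(4), np.ones(4)
positions = rng.uniform(lower, upper, size=(10, 4))
velocities = rng.normal(scale=0.1, size=(10, 4))
positions, velocities = clip_swarm(positions, velocities, lower, upper)
simplex = nelder_mead_simplex(np.full(4, 0.5))
assert simplex.shape == (5, 4)  # n_params + 1 vertices
```

The adaptive swarm coefficients (changed based on steps since the last improvement) are omitted here because the report does not quote their update rule.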