Perception-Aware Point-Based Value Iteration for Partially Observable Markov Decision Processes
Authors: Mahsa Ghasemi, Ufuk Topcu
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the proposed algorithm for active perception and planning, we implement the point-based value iteration solver for AP2-POMDPs. We initialize the belief set by uniform sampling from B [Devroye, 1986]. To focus on the effect of perception, we keep the belief set fixed throughout the iterations. However, one can incorporate any sampling method such as the ones proposed by Kurniawati et al. [2008], and Smith and Simmons [2012]. The α vectors are initialized by (1/(1−γ)) · min_{s,a} R(s,a) · 1_{|S|} [Shani et al., 2013]. Furthermore, to speed up the solver, one can employ a randomized backup step, as suggested by Spaan and Vlassis [2005]. The solver terminates once the difference between value functions in two consecutive iterations falls below a predefined threshold. We also implemented a random perception policy that selects a subset of information sources, uniformly at random, at each backup step. (Section 5.1, Robotic Navigation in 1-D Grid) ... We evaluate the computed policy by running 1000 Monte Carlo simulations. The robot starts at the origin and its initial belief is uniform. Figure 4-(a) demonstrates the discounted cumulative reward, averaged over 1000 runs, for random selection of 1 and 2 cameras, and greedy selection of 1 and 2 cameras. |
| Researcher Affiliation | Academia | Mahsa Ghasemi and Ufuk Topcu University of Texas at Austin {mahsa.ghasemi, utopcu}@utexas.edu |
| Pseudocode | Yes | Algorithm 1 Greedy policy for perception action... Algorithm 2 Generic algorithm for point-based solvers [Araya et al., 2010]... Algorithm 3 Back Up step for AP2-POMDP |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that code for their methodology is released. |
| Open Datasets | No | The paper describes simulation scenarios (1-D and 2-D grids) and mentions initializing belief sets and running Monte Carlo simulations. It does not refer to a publicly available pre-existing dataset that was used for training or evaluation. |
| Dataset Splits | No | The paper describes running Monte Carlo simulations to evaluate the policy but does not specify training, validation, or test dataset splits in the conventional sense (e.g., percentages or sample counts for data partitions). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper describes the algorithms and their implementation but does not list any specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | We initialize the belief set by uniform sampling from B [Devroye, 1986]. ... The α vectors are initialized by (1/(1−γ)) · min_{s,a} R(s,a) · 1_{|S|} [Shani et al., 2013]. ... The solver terminates once the difference between value functions in two consecutive iterations falls below a predefined threshold. ... The robot starts at the origin and its initial belief is uniform. ... The reward is 10 at the goal state, -4 at the obstacles, and -1 in other states. |
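The quoted setup (uniform belief sampling, pessimistic α-vector initialization at the worst-case discounted return, threshold-based termination) can be sketched as follows. This is a minimal illustration, not the authors' code; function names and the 1e-4 threshold are assumptions.

```python
import numpy as np

def initialize_alpha_vectors(R, gamma, num_states):
    """Pessimistic initialization from the paper's setup: every entry of the
    single initial alpha vector is (1 / (1 - gamma)) * min_{s,a} R(s, a),
    a lower bound on any achievable discounted return."""
    lower_bound = R.min() / (1.0 - gamma)
    return [np.full(num_states, lower_bound)]

def sample_belief_set(num_states, num_beliefs, rng=None):
    """Uniform sampling of belief points over the probability simplex;
    Dirichlet(1, ..., 1) is the uniform distribution on the simplex."""
    rng = np.random.default_rng() if rng is None else rng
    return rng.dirichlet(np.ones(num_states), size=num_beliefs)

def has_converged(values_prev, values_curr, threshold=1e-4):
    """Terminate once the largest value-function change across the
    (fixed) belief set falls below the threshold."""
    return np.max(np.abs(values_curr - values_prev)) < threshold
```

For the 1-D grid rewards quoted above (10 at the goal, -4 at obstacles, -1 elsewhere) with γ = 0.95, the initial α-vector entries would all be −4/0.05 = −80.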
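The greedy perception policy (Algorithm 1) and the random baseline both pick a bounded subset of information sources (e.g., 1 or 2 cameras) at each backup step. A minimal sketch of that greedy pattern, with a placeholder `gain` set function standing in for the paper's information-theoretic objective:

```python
import numpy as np

def greedy_select(sources, k, gain):
    """Greedily pick up to k information sources: at each step, add the
    source with the largest marginal gain. `gain(subset)` is any set
    function scoring a candidate subset (in the paper, a perception
    objective; here it is a caller-supplied placeholder)."""
    selected = []
    for _ in range(k):
        remaining = [s for s in sources if s not in selected]
        if not remaining:
            break
        best = max(remaining, key=lambda s: gain(selected + [s]))
        selected.append(best)
    return selected

def random_select(sources, k, rng=None):
    """Random baseline from the paper: a size-k subset chosen uniformly."""
    rng = np.random.default_rng() if rng is None else rng
    return list(rng.choice(sources, size=k, replace=False))
```

With an additive (modular) gain the greedy choice is exactly the top-k sources; the greedy policy only becomes interesting when the gain exhibits diminishing returns across sources.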