Target Surveillance in Adversarial Environments Using POMDPs

Authors: Maxim Egorov, Mykel Kochenderfer, Jaak Uudmae

AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The approach is empirically evaluated against a MOMDP adversary and against a human opponent in a target surveillance computer game. The empirical results demonstrate that, on average, level 3 MOMDP policies outperform lower-level reasoning policies as well as human players.
Researcher Affiliation | Academia | Maxim Egorov and Mykel J. Kochenderfer, Department of Aeronautics and Astronautics, Stanford University, Stanford, California 94305, {megorov, mykel}@stanford.edu; Jaak J. Uudmae, Department of Computer Science, Stanford University, Stanford, California 94305, juudmae@stanford.edu
Pseudocode | No | The paper describes the problem formulation and models but does not provide any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code for this work can be found at https://github.com/sisl/TargetSurveillance.
Open Datasets | No | The paper evaluates its approach on three generated urban environments (Map A, Map B, Map C), with initial conditions set by rejection sampling. It does not use or provide access to a pre-existing publicly available dataset.
Dataset Splits | No | The paper runs 500 simulations of 100 time-steps each on the generated maps, but it does not specify traditional train/validation/test dataset splits, since the environments and evaluation data are generated rather than drawn from a fixed dataset.
Hardware Specification | No | The paper does not provide any details about the hardware used to run its experiments, such as CPU or GPU models or memory specifications.
Software Dependencies | No | The paper mentions using QMDP and SARSOP as solvers and discusses POMDPs and MOMDPs, but it does not specify any software names with version numbers for implementation dependencies (e.g., Python version, library versions).
Experiment Setup | Yes | The initial conditions for each simulation were set using rejection sampling. Each simulation ran for 100 time-steps or until the Red Team landed a ballistic hit on the Blue Team. The reasoning levels of the Blue and Red Teams were chosen to be three and two, respectively. The stochasticity constant was chosen to be μ = 0.4. Model parameters for both the adversary and the agent are given in Table 1.
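The Software Dependencies row above names QMDP and SARSOP as the solvers used, without versioned software. For orientation, the standard QMDP approximation value-iterates the fully observable MDP and then weights the resulting Q-values by the current belief when picking an action. The sketch below is a minimal, generic illustration of that rule, assuming array shapes and names of our own choosing; it is not the authors' implementation (the released Julia code at https://github.com/sisl/TargetSurveillance is authoritative).

```python
import numpy as np

def qmdp_q_values(T, R, gamma=0.95, n_iters=200):
    """Value-iterate the underlying MDP to obtain Q(s, a).

    T: transition tensor of shape (A, S, S), T[a, s, s'] = P(s' | s, a)
    R: reward matrix of shape (S, A)
    (Shapes and names are illustrative assumptions, not the paper's code.)
    """
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        V = Q.max(axis=1)                          # V(s') = max_a Q(s', a)
        Q = R + gamma * np.einsum('ast,t->sa', T, V)
    return Q

def qmdp_action(belief, Q):
    """QMDP rule: argmax_a sum_s b(s) Q(s, a) for belief vector b of shape (S,)."""
    return int(np.argmax(belief @ Q))
```

SARSOP, by contrast, is a point-based solver that optimizes alpha-vectors over reachable beliefs and is normally used through an off-the-shelf implementation rather than a few lines of code, which is why only the solver names appear in the paper.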
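The Experiment Setup and Dataset Splits rows together describe the evaluation protocol: initial conditions drawn by rejection sampling, 500 simulations per configuration, each run for 100 time-steps or until the Red Team lands a ballistic hit, with stochasticity constant μ = 0.4. Below is a minimal sketch of that loop; the environment interface (random_state, is_valid_start, step) and the policy objects are hypothetical placeholders, and in the paper the Blue Team acts on a belief over the adversary rather than the true state.

```python
N_SIMULATIONS = 500   # simulations reported in the paper
MAX_STEPS = 100       # each simulation runs for 100 time-steps
MU = 0.4              # stochasticity constant chosen in the paper

def sample_initial_state(env):
    """Rejection sampling: redraw candidate starting states until one
    satisfies the environment's constraints (hypothetical interface)."""
    while True:
        state = env.random_state()
        if env.is_valid_start(state):
            return state

def evaluate(env, blue_policy, red_policy):
    """Run the evaluation protocol and return the empirical hit rate."""
    hits = 0
    for _ in range(N_SIMULATIONS):
        state = sample_initial_state(env)
        for _ in range(MAX_STEPS):
            # Both teams act; transitions are perturbed with probability MU.
            state, red_hit = env.step(state,
                                      blue_policy.action(state),
                                      red_policy.action(state),
                                      mu=MU)
            if red_hit:  # Red Team landed a ballistic hit on the Blue Team
                hits += 1
                break
    return hits / N_SIMULATIONS
```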