Expectation Optimization with Probabilistic Guarantees in POMDPs with Discounted-Sum Objectives

Authors: Krishnendu Chatterjee, Adrián Elgyütt, Petr Novotný, Owen Rouillé

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present experimental results of our algorithm on classical POMDPs.
Researcher Affiliation | Academia | 1 Institute of Science and Technology Austria, Klosterneuburg, Austria; 2 École Normale Supérieure de Rennes, Rennes, France
Pseudocode | Yes | Algorithm 1: RAMCP. ... Algorithm 2: RAMCP simulations. ... Algorithm 3: RAMCP: action selection and play.
Open Source Code | Yes | Our implementation and the benchmarks are available on-line: https://git.ist.ac.at/petr.novotny/RAMCP-public
Open Datasets | Yes | The first two are the classical Tiger [Kaelbling et al., 1998] and Hallway [Smith and Simmons, 2004] benchmarks, naturally modified to contain a risk-taking aspect.
Dataset Splits | No | The paper uses classical benchmarks but does not specify training, validation, or test dataset splits in terms of percentages or counts.
Hardware Specification | Yes | The test configuration was CPU: Intel i5-3470, 3.20 GHz, 4 cores; 8 GB RAM; OS: Linux Mint 18 64-bit.
Software Dependencies | No | The paper mentions "AI-Toolbox [AI-Toolbox, 2017]" but does not give version numbers for it or for any other software library used in the implementation.
Experiment Setup | Yes | In each execution, we used a timeout of 5 seconds in the first decision step and 0.1 seconds for the remaining steps. We set the exploration constant to 2·X, where X is the difference between the largest and smallest undiscounted payoffs achievable in a given instance.
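The reported exploration-constant choice can be sketched as follows. This is a minimal illustration, not the authors' code: the UCB-style selection score is the standard one used in MCTS-based planners such as POMCP, and the function names and payoff values are hypothetical.

```python
import math

def exploration_constant(max_payoff, min_payoff):
    # Paper's choice: twice the range of achievable undiscounted payoffs.
    return 2 * (max_payoff - min_payoff)

def ucb_score(mean_value, parent_visits, child_visits, c):
    # Standard UCB-style score for MCTS action selection:
    # exploitation term plus c-scaled exploration bonus.
    if child_visits == 0:
        return float("inf")  # unvisited actions are tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / child_visits)

# Hypothetical instance where payoffs range over [-10, 10]:
c = exploration_constant(10.0, -10.0)  # = 40.0
```

A larger payoff range thus yields a proportionally larger exploration bonus, keeping the bonus on the same scale as the value estimates it competes with.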