Expectation Optimization with Probabilistic Guarantees in POMDPs with Discounted-Sum Objectives
Authors: Krishnendu Chatterjee, Adrián Elgyütt, Petr Novotný, Owen Rouillé
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results of our algorithm on classical POMDPs. |
| Researcher Affiliation | Academia | (1) Institute of Science and Technology Austria, Klosterneuburg, Austria; (2) École Normale Supérieure de Rennes, Rennes, France |
| Pseudocode | Yes | Algorithm 1: RAMCP. ... Algorithm 2: RAMCP simulations. ... Algorithm 3: RAMCP: action selection and play. |
| Open Source Code | Yes | Our implementation and the benchmarks are available on-line: https://git.ist.ac.at/petr.novotny/RAMCP-public |
| Open Datasets | Yes | The first two are the classical Tiger [Kaelbling et al., 1998] and Hallway [Smith and Simmons, 2004] benchmarks naturally modified to contain a risk taking aspect. |
| Dataset Splits | No | The paper mentions using classical benchmarks but does not specify training, validation, or test dataset splits in terms of percentages or counts. |
| Hardware Specification | Yes | The test configuration was CPU: Intel-i5-3470, 3.20GHz, 4 cores; 8GB RAM; OS: Linux Mint 18 64-bit. |
| Software Dependencies | No | The paper mentions "AI-Toolbox [AI-Toolbox, 2017]" but does not provide version numbers for this or any other software libraries or dependencies used in the implementation. |
| Experiment Setup | Yes | In each execution, we used a timeout of 5 seconds in the first decision step and 0.1 seconds for the remaining steps. We set the exploration constant to 2·X, where X is the difference between the largest and smallest undiscounted payoffs achievable in a given instance. |
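
The experiment-setup row ties the exploration constant to the payoff range of each instance. As a rough illustration (not the authors' code, and assuming the standard UCB1 action-selection rule used in MCTS-style planners such as RAMCP; the function names below are hypothetical), the constant and its use in action scoring could look like this:

```python
import math

def exploration_constant(max_payoff: float, min_payoff: float) -> float:
    # Per the paper's setup: twice the range of achievable undiscounted payoffs.
    return 2.0 * (max_payoff - min_payoff)

def ucb_score(mean_value: float, visits: int, parent_visits: int, c: float) -> float:
    # Standard UCB1 score: exploitation term plus exploration bonus scaled by c.
    if visits == 0:
        return float("inf")  # unvisited actions are tried first
    return mean_value + c * math.sqrt(math.log(parent_visits) / visits)
```

Scaling the constant by the payoff range keeps the exploration bonus on the same order of magnitude as the value estimates, so the same setting works across benchmarks with very different reward scales.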