Better Be Lucky than Good: Exceeding Expectations in MDP Evaluation
Authors: Thomas Keller, Florian Geißer
AAAI 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical evaluation (where only π is executed) shows that the Sec P is a strategy that improves over V significantly nonetheless. To evaluate our algorithms empirically, we perform experiments on the domains of IPPC 2011 and 2014. |
| Researcher Affiliation | Academia | Thomas Keller and Florian Geißer University of Freiburg Freiburg, Germany {tkeller,geisserf}@informatik.uni-freiburg.de |
| Pseudocode | Yes | Algorithm 1: Mixed Strategy for MDP-ESP^u_k with u > k. compute mixed strategy(u, k, m): |
| Open Source Code | No | The paper provides neither a statement nor a link indicating that the code for its methodology is open source. It mentions using PROST and UCT, but does not say that the authors' own implementation is publicly available. |
| Open Datasets | No | The paper mentions using "domains of IPPC 2011 and 2014" but does not provide specific access information (link, DOI, citation with authors/year) for these datasets, which are required for a 'Yes' classification. |
| Dataset Splits | No | The paper describes the number of evaluation runs (k=30, u up to 10000) and that experiments are conducted 20 times. However, it does not specify train/validation/test dataset splits in the typical machine learning sense. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using "UCT algorithm (Keller and Helmert 2013)" and the "PROST planner (Keller and Eyerich 2012)" but does not specify version numbers for these software components, which is required for a 'Yes' classification. |
| Experiment Setup | Yes | The number of evaluation runs k is set to 30 in all experiments, which corresponds to the number of evaluation runs at both IPPC 2011 and 2014, and the values for u are increased from 30 to 10000. Each experiment is conducted 20 times and average results are reported. ... We use the UCT algorithm (Keller and Helmert 2013) to solve the base-MDP... The resulting algorithm is able to solve 34 instances of the 120 existing IPPC benchmarks... |