Minimax Regret Optimisation for Robust Planning in Uncertain Markov Decision Processes
Authors: Marc Rigter, Bruno Lacerda, Nick Hawes
Pages: 11930-11938
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on both synthetic and real-world domains, showing that it significantly outperforms existing baselines. Experiments in both synthetic and real-world domains demonstrate that our approach considerably outperforms existing baselines. |
| Researcher Affiliation | Academia | Marc Rigter, Bruno Lacerda, Nick Hawes Oxford Robotics Institute, University of Oxford, United Kingdom {mrigter, bruno, nickh}@robots.ox.ac.uk |
| Pseudocode | Yes | Algorithm 1 presents pseudocode for the minimax VI algorithm. |
| Open Source Code | No | The paper does not contain any explicit statement about making the source code for their method publicly available, nor does it provide a link to a code repository. |
| Open Datasets | Yes | For disaster rescue and medical decision making, the samples were generated using the method from (Ahmed et al. 2013, 2017). In underwater glider, ξ consisted of the 12 samples corresponding to each hourly weather forecast. The forecast used was for May 1st 2020 and is available online at https://marine.copernicus.eu/. |
| Dataset Splits | Yes | For the medical domain, each method was evaluated for 250 different randomly generated UMDPs. For the other two domains, each method was evaluated for a range of problem sizes, and each problem size was repeated for 25 different randomly generated UMDPs. For each disaster rescue and medical decision making UMDP, ξ consisted of 15 samples selected using the method from (Ahmed et al. 2013, 2017). In underwater glider, ξ consisted of the 12 samples corresponding to each hourly weather forecast. |
| Hardware Specification | Yes | Computation times are reported for a 3.2 GHz Intel i7 processor. |
| Software Dependencies | No | The paper states 'MILPs are solved using Gurobi, and all other processing is performed in Python.' However, it does not specify version numbers for either Gurobi or Python. |
| Experiment Setup | No | The paper mentions parameters like the perturbation scalar κ and the convergence threshold ϵ in Algorithm 1, but does not provide specific numerical values for these parameters. It also mentions 'action pruning' but without specific details on its configuration. |