Minimax Regret Optimisation for Robust Planning in Uncertain Markov Decision Processes

Authors: Marc Rigter, Bruno Lacerda, Nick Hawes
Pages: 11930-11938

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on both synthetic and real-world domains, showing that it significantly outperforms existing baselines. Experiments in both synthetic and real-world domains demonstrate that our approach considerably outperforms existing baselines.
Researcher Affiliation | Academia | Marc Rigter, Bruno Lacerda, Nick Hawes; Oxford Robotics Institute, University of Oxford, United Kingdom; {mrigter, bruno, nickh}@robots.ox.ac.uk
Pseudocode | Yes | Algorithm 1 presents pseudocode for the minimax VI algorithm.
Open Source Code | No | The paper does not contain any explicit statement about making the source code for their method publicly available, nor does it provide a link to a code repository.
Open Datasets | Yes | For disaster rescue and medical decision making, the samples were generated using the method from (Ahmed et al. 2013, 2017). In underwater glider, ξ consisted of the 12 samples corresponding to each hourly weather forecast. The forecast used was for May 1st 2020 and is available online at https://marine.copernicus.eu/.
Dataset Splits | Yes | For the medical domain, each method was evaluated for 250 different randomly generated UMDPs. For the other two domains, each method was evaluated for a range of problem sizes, and each problem size was repeated for 25 different randomly generated UMDPs. For each disaster rescue and medical decision making UMDP, ξ consisted of 15 samples selected using the method from (Ahmed et al. 2013, 2017). In underwater glider, ξ consisted of the 12 samples corresponding to each hourly weather forecast.
Hardware Specification | Yes | Computation times are reported for a 3.2 GHz Intel i7 processor.
Software Dependencies | No | The paper states 'MILPs are solved using Gurobi, and all other processing is performed in Python.' However, it does not specify version numbers for either Gurobi or Python.
Experiment Setup | No | The paper mentions parameters like the perturbation scalar κ and the convergence threshold ϵ in Algorithm 1, but does not provide specific numerical values for these parameters. It also mentions 'action pruning' but without specific details on its configuration.
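
Illustrative note: since the paper's code is not available and the values of κ and ϵ are not reported, the sketch below only shows what "maximum regret of a policy over a finite sample set ξ" could look like in code, and where a convergence threshold enters. It is not the paper's Algorithm 1 (no perturbation scalar, no action pruning, no MILP). The discounted-reward setting, the MDP encoding, the function names, and the values `gamma=0.5` and `eps=1e-8` are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical encoding of one sampled MDP (not taken from the paper):
#   P[s][a] is a list of (next_state, probability) pairs,
#   R[s][a] is the immediate reward for taking action a in state s.
Transitions = List[List[List[Tuple[int, float]]]]
Rewards = List[List[float]]


def q_value(P: Transitions, R: Rewards, V: List[float],
            s: int, a: int, gamma: float) -> float:
    """One-step lookahead value of action a in state s."""
    return R[s][a] + gamma * sum(p * V[t] for t, p in P[s][a])


def optimal_values(P, R, gamma=0.5, eps=1e-8) -> List[float]:
    """Standard value iteration; stops once the largest update is below eps."""
    V = [0.0] * len(P)
    while True:
        new_V = [max(q_value(P, R, V, s, a, gamma) for a in range(len(P[s])))
                 for s in range(len(P))]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < eps:
            return V


def policy_values(P, R, policy, gamma=0.5, eps=1e-8) -> List[float]:
    """Iterative evaluation of a fixed deterministic policy."""
    V = [0.0] * len(P)
    while True:
        new_V = [q_value(P, R, V, s, policy[s], gamma) for s in range(len(P))]
        delta = max(abs(x - y) for x, y in zip(new_V, V))
        V = new_V
        if delta < eps:
            return V


def max_regret(samples, policy, s0=0, gamma=0.5, eps=1e-8) -> float:
    """Largest gap, over all samples, between the optimal value and the policy's value at s0."""
    return max(
        optimal_values(P, R, gamma, eps)[s0] - policy_values(P, R, policy, gamma, eps)[s0]
        for P, R in samples
    )


# Two toy samples standing in for ξ. State 1 is absorbing and pays reward 1 per step.
# In sample A, action 0 reaches state 1 from state 0; in sample B, action 1 does.
R_shared = [[0.0, 0.0], [1.0]]
P_a = [[[(1, 1.0)], [(0, 1.0)]], [[(1, 1.0)]]]
P_b = [[[(0, 1.0)], [(1, 1.0)]], [[(1, 1.0)]]]
samples = [(P_a, R_shared), (P_b, R_shared)]

# Policy that always picks action 0: optimal in sample A, poor in sample B.
worst_case = max_regret(samples, policy=[0, 0])
```

In this toy case the policy loses nothing in sample A but forgoes the whole reachable value in sample B, so `worst_case` is driven by sample B. A minimax regret planner would search for the policy that makes this quantity as small as possible.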