Robust $\phi$-Divergence MDPs
Authors: Chin Pang Ho, Marek Petrik, Wolfram Wiesemann
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our fast suite of algorithms with the state-of-the-art solver MOSEK 9.3 [3] (commercial) and the first-order method of [14]. All experiments are implemented in C++, and they are run on a 3.6 GHz 8-Core Intel Core i9 CPU with 32 GB 2667 MHz DDR4 main memory. The source code is available at https://sites.google.com/view/clint-chin-pang-ho. Tables 2-4 report average computation times over 50 randomly generated test instances for the KL-divergence and the $\chi^2$-distance based ambiguity sets and show that the proposed algorithms outperform other methods. The tables reveal that our algorithms are about two orders of magnitude faster than MOSEK in solving the projection problem (5). |
| Researcher Affiliation | Academia | Chin Pang Ho City University of Hong Kong clint.ho@cityu.edu.hk Marek Petrik University of New Hampshire mpetrik@cs.unh.edu Wolfram Wiesemann Imperial College London ww@imperial.ac.uk |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the main text of the paper. |
| Open Source Code | Yes | The source code is available at https://sites.google.com/view/clint-chin-pang-ho. |
| Open Datasets | No | For our experiments, we synthetically generate random RMDP instances as follows. For the projection problem, we sample each component of $b$ uniformly at random between 0 and 1. Similarly, we sample each component of $p_{sa}$ uniformly at random between 0 and 1 and subsequently scale $p_{sa}$ so that its elements sum up to 1. The parameter $\beta$, finally, is uniformly distributed between $\min\{b\} + 10^{-8}$ and $p_{sa}^\top b - 10^{-8}$ to adhere to the assumptions of our paper. For the robust Bellman update, all vectors $b_{sa}$ and all transition probabilities $p_{sa}$, $s \in \mathcal{S}$ and $a \in \mathcal{A}$, are generated according to the above procedure. The parameter $\kappa$ is also sampled from a uniform distribution supported on $[0, 1]$. |
| Dataset Splits | Yes | For our experiments, we synthetically generate random RMDP instances as follows. For the projection problem, we sample each component of $b$ uniformly at random between 0 and 1. Similarly, we sample each component of $p_{sa}$ uniformly at random between 0 and 1 and subsequently scale $p_{sa}$ so that its elements sum up to 1. The parameter $\beta$, finally, is uniformly distributed between $\min\{b\} + 10^{-8}$ and $p_{sa}^\top b - 10^{-8}$ to adhere to the assumptions of our paper. For the robust Bellman update, all vectors $b_{sa}$ and all transition probabilities $p_{sa}$, $s \in \mathcal{S}$ and $a \in \mathcal{A}$, are generated according to the above procedure. The parameter $\kappa$ is also sampled from a uniform distribution supported on $[0, 1]$. |
| Hardware Specification | Yes | All experiments are implemented in C++, and they are run on a 3.6 GHz 8-Core Intel Core i9 CPU with 32 GB 2667 MHz DDR4 main memory. |
| Software Dependencies | Yes | We compare our fast suite of algorithms with the state-of-the-art solver MOSEK 9.3 [3] (commercial) and the first-order method of [14]. All experiments are implemented in C++... |
| Experiment Setup | Yes | For our experiments, we synthetically generate random RMDP instances as follows. For the projection problem, we sample each component of $b$ uniformly at random between 0 and 1. Similarly, we sample each component of $p_{sa}$ uniformly at random between 0 and 1 and subsequently scale $p_{sa}$ so that its elements sum up to 1. The parameter $\beta$, finally, is uniformly distributed between $\min\{b\} + 10^{-8}$ and $p_{sa}^\top b - 10^{-8}$ to adhere to the assumptions of our paper. For the robust Bellman update, all vectors $b_{sa}$ and all transition probabilities $p_{sa}$, $s \in \mathcal{S}$ and $a \in \mathcal{A}$, are generated according to the above procedure. The parameter $\kappa$ is also sampled from a uniform distribution supported on $[0, 1]$. |
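The instance-generation procedure quoted above is concrete enough to sketch in code. The following is a minimal Python sketch under stated assumptions: the function name `random_projection_instance` and the use of NumPy are our own (the paper's experiments are implemented in C++), and `eps` stands in for the $10^{-8}$ margin in the quoted procedure.

```python
import numpy as np

def random_projection_instance(n, seed=None, eps=1e-8):
    """Sample one random projection-problem instance per the quoted procedure:
    b ~ U[0,1]^n, p_sa ~ U[0,1]^n rescaled onto the simplex, and
    beta ~ U[min(b) + eps, p_sa @ b - eps]."""
    rng = np.random.default_rng(seed)
    b = rng.uniform(0.0, 1.0, n)
    p_sa = rng.uniform(0.0, 1.0, n)
    p_sa /= p_sa.sum()  # scale so the elements sum up to 1
    # beta lies strictly between min(b) and the nominal value p_sa @ b,
    # matching the paper's stated assumptions on the projection problem
    beta = rng.uniform(b.min() + eps, p_sa @ b - eps)
    return b, p_sa, beta
```

For the robust Bellman update, the quoted procedure additionally samples $\kappa$ uniformly from $[0, 1]$ and draws one $(b_{sa}, p_{sa})$ pair per state-action pair in the same way.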
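The excerpts do not reproduce the paper's projection problem (5) or its algorithms, but the requirement $\min\{b\} < \beta < p_{sa}^\top b$ can be illustrated with a generic KL-divergence projection: minimizing $\mathrm{KL}(p \,\|\, p_{sa})$ over the simplex subject to $p^\top b = \beta$ has an exponential-tilt solution with a single scalar multiplier, which bisection recovers precisely when $\beta$ lies in that interval. This is a standard textbook construction offered only as an illustration, not the paper's (much faster) method.

```python
import numpy as np

def kl_projection(p_hat, b, beta, iters=200):
    """Minimize KL(p || p_hat) over the simplex subject to p @ b = beta.
    The minimizer has the form p_i ∝ p_hat_i * exp(-lam * b_i); the map
    lam -> p(lam) @ b decreases monotonically from p_hat @ b (at lam = 0)
    toward min(b) (as lam -> inf), so bisection on lam finds the root
    whenever min(b) < beta < p_hat @ b."""
    def tilt(lam):
        w = p_hat * np.exp(-lam * (b - b.min()))  # shift exponent for stability
        return w / w.sum()
    if p_hat @ b <= beta:
        return p_hat.copy()  # constraint already met at lam = 0
    lo, hi = 0.0, 1.0
    while tilt(hi) @ b > beta:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if tilt(mid) @ b > beta else (lo, mid)
    return tilt(0.5 * (lo + hi))
```

If $\beta \le \min\{b\}$ no feasible tilt exists and the bracket-growing loop would not terminate, which is exactly why the generated instances keep $\beta$ at least $10^{-8}$ above $\min\{b\}$.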