reproducibilityindex.ai

Fast Algorithms for $L_\infty$-constrained S-rectangular Robust MDPs

Authors: Bahram Behzadian, Marek Petrik, Chin Pang Ho

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental results conﬁrm the practical viability of our method and show that it outperforms a leading commercial optimization package by several orders of magnitude.
Researcher Affiliation	Academia	Bahram Behzadian University of New Hampshire bahram@cs.unh.edu Marek Petrik University of New Hampshire mpetrik@cs.unh.edu Chin Pang Ho City University of Hong Kong clint.ho@cityu.edu.hk
Pseudocode	Yes	Algorithm 1 Homotopy method to compute q(ξ) ... Algorithm 2 Bisection method for solving (7).
Open Source Code	No	The paper states that the algorithms are implemented in C++, but it does not provide any link or explicit statement about making the source code publicly available.
Open Datasets	Yes	As the main benchmark problem, we use the classic Inventory Management (IM) problem (Zipkin, 2000). To evaluate the performance of our methods on small problems, we also consider the River Swim (RS) domain (Strehl and Littman, 2008) and the Machine Replacement (MR) domain (Delage and Ye, 2010).
Dataset Splits	No	The paper describes experimental evaluation on standard MDP domains for runtime performance, but it does not specify training, validation, or test dataset splits as typically found in supervised machine learning contexts, as it concerns algorithm performance on a defined problem rather than model generalization from data.
Hardware Specification	Yes	The results were generated on a computer with an Intel i7-9700 CPU with 32 GB RAM; the algorithms are implemented in C++.
Software Dependencies	Yes	This section compares the empirical runtime of Algorithms 1 and 2 with the runtime of Gurobi 9.1, a leading LP solver.
Experiment Setup	Yes	The x-axis represents the number of states (maximum holding capacity) in the domain. The number of actions is the same as the number of states. We use the robustness budget κ = 1.2, but the computation time is insensitive to the particular choice of κ.