First-Order Methods for Wasserstein Distributionally Robust MDP
Authors: Julien Grand-Clément, Christian Kroer
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments show that our algorithm is significantly more scalable than state-of-the-art approaches across several domains. |
| Researcher Affiliation | Academia | IEOR Department, Columbia University. |
| Pseudocode | Yes | Algorithm 1: First-order Method for Wasserstein DR-MDP (see the first sketch after this table). |
| Open Source Code | No | The paper does not provide any specific links to source code or state that the code is publicly available. |
| Open Datasets | No | For each MDP instance, we generate the sampled kernels ŷ_1, ..., ŷ_N by considering N small random (Garnet) perturbations around the true nominal kernel y_0 (see Appendix F; a sampling sketch is given second below the table). |
| Dataset Splits | No | The paper describes generating MDP instances and evaluating performance based on running times and optimality criteria, but it does not specify train/validation/test dataset splits for a fixed dataset. |
| Hardware Specification | Yes | We run our simulations on a laptop with 2.2 GHz Intel Core i7 and 8 GB of RAM. |
| Software Dependencies | Yes | We implement our algorithms in Python 3.7.3, using Gurobi 8.1.1 to solve any linear/quadratic optimization program involved (see the third sketch after this table). |
| Experiment Setup | Yes | All figures in this section show the running times of the algorithms before returning an ϵ-optimal policy with ϵ = 0.1. We initialize all algorithms with v_0 = 0. The running times are averaged across 5 instances by changing the seeds for sampling the N kernels around y_0 (a timing harness is sketched last below the table). |
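
The paper's Algorithm 1 is not reproduced in this report. As a rough structural illustration only, the sketch below shows the generic shape of a first-order robust value iteration: an outer Bellman-style loop around an inner projected-gradient step in which the adversary perturbs the transition kernel. The projection `project_to_ambiguity_set`, the step size `eta`, and the inner iteration count are hypothetical placeholders, not the paper's actual operators or parameters.

```python
import numpy as np

def robust_value_iteration(P_nominal, r, gamma, project_to_ambiguity_set,
                           eps=0.1, inner_steps=50, eta=0.01):
    """Generic first-order robust value iteration (illustrative sketch only).

    P_nominal: (S, A, S) nominal transition kernel
    r:         (S, A) rewards
    project_to_ambiguity_set(p, s, a): hypothetical projection of a
        perturbed kernel row back onto the ambiguity set around (s, a).
    """
    S, A, _ = P_nominal.shape
    v = np.zeros(S)  # v_0 = 0, matching the paper's initialization
    while True:
        q = np.empty((S, A))
        for s in range(S):
            for a in range(A):
                p = P_nominal[s, a].copy()
                for _ in range(inner_steps):
                    # Adversary lowers the expected value p @ v:
                    # gradient step on p (gradient is v), then project.
                    p = project_to_ambiguity_set(p - eta * v, s, a)
                q[s, a] = r[s, a] + gamma * (p @ v)
        v_new = q.max(axis=1)
        # Standard value-iteration stopping rule for an eps-optimal policy.
        if np.max(np.abs(v_new - v)) <= eps * (1 - gamma) / (2 * gamma):
            return v_new, q.argmax(axis=1)
        v = v_new
```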
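The sampled kernels are described only at a high level (random Garnet perturbations around the nominal kernel; the paper defers details to Appendix F). Below is a minimal sketch of one plausible reading: mix the nominal kernel with random Garnet-style kernels. The mixing weight `delta` and branching factor `b` are assumptions, not values taken from the paper.

```python
import numpy as np

def garnet_kernel(S, A, b, rng):
    """Random Garnet kernel: each (s, a) reaches b uniformly chosen states,
    with probabilities drawn from a random partition of [0, 1]."""
    P = np.zeros((S, A, S))
    for s in range(S):
        for a in range(A):
            nxt = rng.choice(S, size=b, replace=False)
            cuts = np.sort(np.concatenate(([0.0, 1.0], rng.random(b - 1))))
            P[s, a, nxt] = np.diff(cuts)  # b probabilities summing to 1
    return P

def sampled_kernels(P_nominal, N, delta=0.1, b=3, seed=0):
    """N small random perturbations around the nominal kernel y_0
    (one plausible reading of 'Garnet perturbations'; delta and b assumed)."""
    rng = np.random.default_rng(seed)
    S, A, _ = P_nominal.shape
    return [(1 - delta) * P_nominal + delta * garnet_kernel(S, A, b, rng)
            for _ in range(N)]
```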
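The stated stack is Python 3.7.3 with Gurobi 8.1.1 for the embedded linear/quadratic programs. As a minimal gurobipy sketch, the snippet below solves a small LP over the probability simplex; the model and cost vector are illustrative, not one of the paper's actual subproblems.

```python
import gurobipy as gp
from gurobipy import GRB

c = [1.0, 2.0, 0.5]  # illustrative cost vector
m = gp.Model("simplex_lp")
p = m.addVars(len(c), lb=0.0, ub=1.0, name="p")
m.addConstr(p.sum() == 1.0, name="simplex")  # p lies on the simplex
m.setObjective(gp.quicksum(c[i] * p[i] for i in range(len(c))), GRB.MINIMIZE)
m.optimize()
print([p[i].X for i in range(len(c))])  # optimizer puts all mass on argmin c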
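The reported setup (ϵ = 0.1, initialization v_0 = 0, running times averaged over 5 seeded instances) maps directly onto a simple timing harness. In the sketch below, `build_instance` and `solve_to_eps` are hypothetical placeholders for the paper's instance generator and solver.

```python
import time
import numpy as np

EPS = 0.1
SEEDS = range(5)  # 5 instances, differing only in the sampling seed

def average_runtime(build_instance, solve_to_eps):
    """build_instance(seed) -> MDP instance with N sampled kernels;
    solve_to_eps(instance, eps, v0) -> eps-optimal policy.
    Both callables are hypothetical stand-ins."""
    times = []
    for seed in SEEDS:
        instance = build_instance(seed)
        v0 = np.zeros(instance.num_states)  # all algorithms start from v_0 = 0
        t = time.perf_counter()
        solve_to_eps(instance, EPS, v0)
        times.append(time.perf_counter() - t)
    return float(np.mean(times))  # averaged running time across the 5 seeds
```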