Semi-infinitely Constrained Markov Decision Processes
Authors: Liangyu Zhang, Yang Peng, Wenhao Yang, Zhihua Zhang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also conduct extensive numerical examples to illustrate the SICMDP model and validate the SI-CRL algorithm. |
| Researcher Affiliation | Academia | Liangyu Zhang, Academy of Advanced Interdisciplinary Studies, Peking University, zhangliangyu@pku.edu.cn; Yang Peng, School of Mathematical Sciences, Peking University, pengyang@pku.edu.cn; Wenhao Yang, Academy of Advanced Interdisciplinary Studies, Peking University, yangwenhaosms@pku.edu.cn; Zhihua Zhang, School of Mathematical Sciences, Peking University, zhzhang@math.pku.edu.cn |
| Pseudocode | Yes | Algorithm 1 SI-CRL |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Please see supplementary material. |
| Open Datasets | No | The paper states that its datasets are 'generated by a generative model' or 'by a probability measure', and describes two numerical examples, 'toy SICMDP' and 'discharge of sewage'. The latter is described as 'adapted from [18]', which is a research paper rather than a dataset source. |
| Dataset Splits | No | The paper mentions controlling the size of the dataset 'm' and that the dataset is 'generated by generative models', but it does not specify explicit training, validation, or test split percentages or sample counts within the provided text. |
| Hardware Specification | Yes | All the experiments are run on a workstation with 8 CPUs and no GPU. |
| Software Dependencies | Yes | We implement our methods with Python and LP problems are solved using a full-featured university version of Gurobi [19]. |
| Experiment Setup | Yes | We set T sufficiently large such that the algorithm is guaranteed to converge. Then we gradually increase m, the size of the dataset... We set ϵ = 0.01, δ = 0.005 |S|²|A|. |
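
The Software Dependencies row notes that the experiments reduce to LP problems. As a rough illustration of what such an LP can look like, the sketch below solves a tiny constrained MDP over occupancy measures, with a continuum of constraints indexed by y in [0, 1] replaced by a finite grid. It is not the authors' SI-CRL implementation: it uses SciPy's `linprog` (HiGHS) rather than Gurobi, and the function name `solve_discretized_cmdp`, the transition tensor, rewards, cost function, bound, and grid are all hypothetical values chosen for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def solve_discretized_cmdp(P, r, cost_fn, bound_fn, mu, gamma, grid):
    """Occupancy-measure LP for a discounted constrained MDP (illustrative only).

    P: (S, A, S) transition tensor, r: (S, A) rewards, mu: (S,) initial
    distribution. cost_fn(y) returns an (S, A) cost array and bound_fn(y) a
    scalar bound; one linear constraint is added per grid point y.
    """
    S, A = r.shape
    n = S * A

    # Flow constraints on the normalized occupancy measure q:
    # sum_a q(s', a) - gamma * sum_{s, a} P(s' | s, a) q(s, a) = (1 - gamma) mu(s')
    A_eq = np.zeros((S, n))
    for s in range(S):
        for a in range(A):
            col = s * A + a
            A_eq[s, col] += 1.0
            A_eq[:, col] -= gamma * P[s, a, :]
    b_eq = (1.0 - gamma) * mu

    # Finite grid standing in for the infinitely many constraints:
    # sum_{s, a} c_y(s, a) q(s, a) <= u_y for each y in the grid.
    A_ub = np.array([cost_fn(y).reshape(n) for y in grid])
    b_ub = np.array([bound_fn(y) for y in grid])

    # linprog minimizes, so negate the reward to maximize it.
    res = linprog(-r.reshape(n), A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun, res.x.reshape(S, A)


# Hypothetical 2-state, 2-action example (not taken from the paper).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.1, 0.9]]])
r = np.array([[0.0, 0.0], [1.0, 1.0]])           # reward 1 while in state 1
mu = np.array([1.0, 0.0])                         # start in state 0
state1 = np.array([[0.0, 0.0], [1.0, 1.0]])


def cost_fn(y):
    return y * state1                             # cost scales with y in state 1


def bound_fn(y):
    return 0.3                                    # same bound for every y


grid = np.linspace(0.0, 1.0, 11)
value, q = solve_discretized_cmdp(P, r, cost_fn, bound_fn, mu, 0.9, grid)
print("normalized constrained value:", value)
print("occupancy measure:\n", q)
```

In this toy instance the reward equals the occupancy of state 1, and the constraint at y = 1 caps that occupancy at 0.3, so the grid point y = 1 is the binding one. A coarser or finer grid changes only how well the finite LP approximates the continuum of constraints.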