Optimizing Quantiles in Preference-Based Markov Decision Processes
Authors: Hugo Gilbert, Paul Weng, Yan Xu
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Finally, we provide the results of experiments testing our algorithm in a variety of settings." Also, from Section 5 (Experimental Results): "We experimentally evaluated our approach on a server equipped with four Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz and 64Gb of RAM." |
| Researcher Affiliation | Academia | Hugo Gilbert: Sorbonne Universités, UPMC Univ Paris 06, CNRS, LIP6 UMR 7606, Paris, France, hugo.gilbert@lip6.fr. Paul Weng, Yan Xu: SYSU-CMU Joint Institute of Engineering, Guangzhou, China; School of Electronics and Information Technology, Guangzhou, China; SYSU-CMU Shunde International Joint Research Institute, Shunde, China, {paweng,xuyan}@cmu.edu |
| Pseudocode | Yes | Algorithm 1: Binary Search for the Lower Quantile (resp. Upper Quantile) and Algorithm 2: Functional Backward Induction |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-sourcing the code for the described methodology. |
| Open Datasets | No | The paper uses generated random MDPs (Garnets) and a data center control problem model, but does not provide access information or citations for a publicly available or open dataset. |
| Dataset Splits | No | The paper describes experiments on randomly generated MDPs and a data center model, but does not provide specific details on training, validation, or test dataset splits. |
| Hardware Specification | Yes | We experimentally evaluated our approach on a server equipped with four Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz and 64Gb of RAM. |
| Software Dependencies | No | The paper states 'The algorithms were implemented in Matlab' but does not provide specific version numbers for Matlab or any other software dependencies. |
| Experiment Setup | Yes | The horizon of the problem was set to 5. The action represents the number of servers that will be on at the next time step. We assume for simplicity that the maximum number of jobs that can arrive at one timestep is three times the total number of servers. ... we optimize the 0.1-quantile with ε = 0.001 in binary search. |