Optimizing Quantiles in Preference-Based Markov Decision Processes

Authors: Hugo Gilbert, Paul Weng, Yan Xu

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'Finally, we provide the results of experiments testing our algorithm in a variety of settings.' and '5 Experimental Results: We experimentally evaluated our approach on a server equipped with four Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz and 64Gb of RAM.'
Researcher Affiliation | Academia | Hugo Gilbert: Sorbonne Universités, UPMC Univ Paris 06, CNRS, LIP6 UMR 7606, Paris, France (hugo.gilbert@lip6.fr). Paul Weng, Yan Xu: SYSU-CMU Joint Institute of Engineering, Guangzhou, China; School of Electronics and Information Technology, Guangzhou, China; SYSU-CMU Shunde International Joint Research Institute, Shunde, China ({paweng,xuyan}@cmu.edu).
Pseudocode | Yes | 'Algorithm 1: Binary Search for the Lower Quantile (resp. Upper Quantile)' and 'Algorithm 2: Functional Backward Induction'
Open Source Code | No | The paper does not provide any explicit statement or link for open-sourcing the code for the described methodology.
Open Datasets | No | The paper uses generated random MDPs (Garnets) and a data center control problem model, but does not provide access information or citations for a publicly available or open dataset.
Dataset Splits | No | The paper describes experiments on randomly generated MDPs and a data center model, but does not provide specific details on training, validation, or test dataset splits.
Hardware Specification | Yes | 'We experimentally evaluated our approach on a server equipped with four Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz and 64Gb of RAM.'
Software Dependencies | No | The paper states 'The algorithms were implemented in Matlab' but does not provide specific version numbers for Matlab or any other software dependencies.
Experiment Setup | Yes | 'The horizon of the problem was set to 5. The action represents the number of servers that will be on at the next time step. We assume for simplicity that the maximum number of jobs that can arrive at one timestep is three times the total number of servers. ... we optimize the 0.1-quantile with ε = 0.001 in binary search.'
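
For readers unfamiliar with the quantity named in the experiment setup, the sketch below illustrates what a lower τ-quantile (here τ = 0.1) means for a discrete distribution over ordered outcomes: the worst-ranked outcome whose cumulative probability reaches τ. It assumes the standard textbook definition. The function name `lower_quantile` and the example distribution are hypothetical, and this is not the paper's Algorithm 1, which searches over policies rather than evaluating a fixed distribution.

```python
from itertools import accumulate


def lower_quantile(outcomes, probs, tau):
    """Return the lower tau-quantile of a discrete distribution.

    `outcomes` must be sorted from least to most preferred, and `probs[i]`
    is the probability of `outcomes[i]`. The result is the first outcome
    whose cumulative probability is at least `tau`.
    (Hypothetical helper for illustration only.)
    """
    if not 0 < tau <= 1:
        raise ValueError("tau must lie in (0, 1]")
    if len(outcomes) != len(probs) or not outcomes:
        raise ValueError("outcomes and probs must be non-empty and equal in length")
    for outcome, cumulative in zip(outcomes, accumulate(probs)):
        # Small tolerance guards against floating-point round-off in the sum.
        if cumulative >= tau - 1e-12:
            return outcome
    return outcomes[-1]


# Made-up distribution over four ordered outcomes (0 = worst, 3 = best).
example_outcomes = [0, 1, 2, 3]
example_probs = [0.05, 0.10, 0.35, 0.50]
q10 = lower_quantile(example_outcomes, example_probs, 0.1)
```

In this made-up example the cumulative probability first reaches 0.1 at outcome 1, so `q10` is 1. A policy with a higher 0.1-quantile guarantees a better outcome in all but the worst 10% of cases.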