Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Optimizing Quantiles in Preference-Based Markov Decision Processes

Authors: Hugo Gilbert, Paul Weng, Yan Xu

AAAI 2017

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Finally, we provide the results of experiments testing our algorithm in a variety of settings. and 5 Experimental Results We experimentally evaluated our approach on a server equipped with four Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz and 64Gb of RAM. |
| Researcher Affiliation | Academia | Hugo Gilbert Sorbonne Universités, UPMC Univ Paris 06, CNRS, LIP6 UMR 7606, Paris, France EMAIL Paul Weng, Yan Xu SYSU-CMU Joint Institute of Engineering, Guangzhou, China School of Electronics and Information Technology, Guangzhou, China SYSU-CMU Shunde International Joint Research Institute, Shunde, China EMAIL |
| Pseudocode | Yes | Algorithm 1: Binary Search for the Lower Quantile (resp. Upper Quantile) and Algorithm 2: Functional Backward Induction |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-sourcing the code for the described methodology. |
| Open Datasets | No | The paper uses generated random MDPs (Garnets) and a data center control problem model, but does not provide access information or citations for a publicly available or open dataset. |
| Dataset Splits | No | The paper describes experiments on randomly generated MDPs and a data center model, but does not provide specific details on training, validation, or test dataset splits. |
| Hardware Specification | Yes | We experimentally evaluated our approach on a server equipped with four Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz and 64Gb of RAM. |
| Software Dependencies | No | The paper states 'The algorithms were implemented in Matlab' but does not provide specific version numbers for Matlab or any other software dependencies. |
| Experiment Setup | Yes | The horizon of the problem was set to 5. The action represents the number of servers that will be on at the next time step. We assume for simplicity that the maximum number of jobs that can arrive at one timestep is three times the total number of servers. ... we optimize the 0.1-quantile with ε = 0.001 in binary search. |