reproducibilityindex.ai

Multi-Objective MDPs with Conditional Lexicographic Reward Preferences

Authors: Kyle Wray, Shlomo Zilberstein, Abdel-Illah Mouaddib

AAAI 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The performance of LVI in practice is tested within a realistic benchmark problem in the domain of semi-autonomous driving. Section 4 presents our experimental results. We evaluated the computation times of both weighted VI, as well as LVI run on both a CPU and GPU.
Researcher Affiliation	Academia	(1) School of Computer Science, University of Massachusetts, Amherst, MA, USA; (2) GREYC Laboratory, University of Caen, Basse-Normandie, France
Pseudocode	Yes	Algorithm 1 Lexicographic Value Iteration (LVI)
Open Source Code	No	The paper states, "In future work, we plan to enrich the semi-autonomous driving domain and make the scenario creation tools public to facilitate further research and comparison of algorithms for multi-objective optimization." This indicates a future plan for releasing tools, not current availability of the methodology's source code.
Open Datasets	Yes	We use real-world road data from Open Street Map (footnote 1) for sections of the 10 cities in Table 1. Footnote 1: http://wiki.openstreetmap.org
Dataset Splits	No	The paper does not specify traditional training/validation/test dataset splits. While it uses real-world road data to define the MDP environment, the evaluation is based on the algorithm's performance within these environments, not on separate data partitions for training, validation, or testing in the machine learning sense.
Hardware Specification	Yes	Our experiments shown in Table 1 were executed with an Intel(R) Core(TM) i7-4702HQ CPU at 2.20GHz, 8GB of RAM, and an Nvidia(R) GeForce GTX 870M graphics card using C++ and CUDA(C) 6.5.
Software Dependencies	Yes	Our experiments shown in Table 1 were executed with an Intel(R) Core(TM) i7-4702HQ CPU at 2.20GHz, 8GB of RAM, and an Nvidia(R) GeForce GTX 870M graphics card using C++ and CUDA(C) 6.5.
Experiment Setup	Yes	LMDP states are formed by a pair of intersections (previous and current), driver tiredness (true/false), and autonomy (enabled/disabled). Actions are taken at intersections... The stochasticity within state transitions model the likelihood that the driver will drift from attentive to tired, following probability of 0.1. The time cost is proportional to the time spent on the road (in seconds), plus a small constant value of 5 (seconds)... We allow for a 10 second slack in the expected time to reach the goal, in order to favor selecting autonomy-capable roads.
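
The setup quoted in the Experiment Setup row can be illustrated with a small sketch. It is not the authors' implementation (the paper reports C++ and CUDA, and no code was released). The names `DrivingState`, `time_cost`, `slack_filter`, and `choose_action` are hypothetical, as are the `scale` factor and the secondary "non-autonomy" cost. The sketch only shows how a 10 second slack on expected travel time can let a slower but autonomy-capable road win over the fastest one.

```python
from dataclasses import dataclass

# Constants taken from the quoted setup; everything else is illustrative.
P_BECOME_TIRED = 0.1   # probability the driver drifts from attentive to tired
TIME_CONSTANT_S = 5.0  # small constant added to each road's time cost (seconds)
TIME_SLACK_S = 10.0    # allowed slack in expected time to reach the goal (seconds)


@dataclass(frozen=True)
class DrivingState:
    """One LMDP state: a pair of intersections plus two boolean flags."""
    previous_intersection: int
    current_intersection: int
    driver_tired: bool
    autonomy_enabled: bool


def time_cost(road_seconds, scale=1.0):
    """Time cost proportional to seconds on the road, plus the constant.

    The proportionality factor is not given in the source; scale=1.0 is an assumption.
    """
    return scale * road_seconds + TIME_CONSTANT_S


def slack_filter(expected_costs, slack):
    """Keep the actions whose expected cost is within `slack` of the minimum."""
    best = min(expected_costs.values())
    return [a for a, cost in expected_costs.items() if cost - best <= slack]


def choose_action(expected_time, secondary_cost, slack=TIME_SLACK_S):
    """Lexicographic choice: filter on time first, then minimize the secondary cost."""
    candidates = slack_filter(expected_time, slack)
    return min(candidates, key=lambda a: secondary_cost[a])


# Hypothetical expected times (seconds) and secondary costs for three roads.
expected_time = {"highway": 100.0, "autonomy_road": 108.0, "scenic": 125.0}
secondary_cost = {"highway": 3.0, "autonomy_road": 0.0, "scenic": 0.0}

start = DrivingState(0, 1, driver_tired=False, autonomy_enabled=False)
with_slack = choose_action(expected_time, secondary_cost)
without_slack = choose_action(expected_time, secondary_cost, slack=0.0)
```

With the 10 second slack, the filter keeps both `highway` and `autonomy_road`, and the secondary cost then selects `autonomy_road`. With zero slack only `highway` survives. This one-step filter is a simplification; the paper's Algorithm 1 (LVI) applies its slack inside value iteration over the full state space.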