NondBREM: Nondeterministic Offline Reinforcement Learning for Large-Scale Order Dispatching

Authors: Hongbo Zhang, Guang Wang, Xu Wang, Zhengyang Zhou, Chen Zhang, Zheng Dong, Yang Wang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on large-scale real-world ride-hailing datasets show the superiority of our design.
Researcher Affiliation | Academia | 1 University of Science and Technology of China; 2 Florida State University; 3 Wayne State University
Pseudocode | Yes | Algorithm 1: NondBCQ
Open Source Code | No | The paper does not provide an explicit statement about open-sourcing the code or a link to a code repository for the methodology described.
Open Datasets | No | We evaluate our algorithm using real-world ride-hailing data from a large city over a period of eight weeks. Two types of data are utilized: vehicle GPS data and more than 20 million order records from over 50K vehicles. The dataset spans 09/2021 to 11/2021. The paper mentions that "Data for Social Good initiatives also make them available for research" but does not provide specific access information (link, DOI, or formal citation) for the dataset used in this study.
Dataset Splits | No | We use the data from the first 6 weeks to train the model, and the data from the last 2 weeks are loaded into the simulator for performance evaluation. The paper specifies training and test data but does not mention a distinct validation set or its split.
Hardware Specification | Yes | Our experiment is implemented in Python with TensorFlow 1.15, and executed in an environment with an Intel(R) Xeon(R) E5-2620 v4 @ 2.10 GHz CPU and one Nvidia Tesla V100 16 GB GPU.
Software Dependencies | Yes | Our experiment is implemented in Python with TensorFlow 1.15.
Experiment Setup | Yes | The tuned hyperparameters are set as follows: γ = 0.95, τ = 1, λ = 0.75, β = min(max(n/n, 0.9), 1).
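
The β expression quoted in the experiment setup appears to be a standard clamp, min(max(x, 0.9), 1), applied to some data-dependent ratio; the numerator's subscript is lost in the extracted text, so the exact quantity being clamped is unclear. A minimal sketch of the clamp itself, with the ratio left as a generic argument:

```python
def beta_schedule(ratio: float) -> float:
    """Clamp a data-dependent ratio into the interval [0.9, 1.0].

    Interpreted from the paper's 'beta = min(max(n/n, 0.9), 1)';
    the exact numerator is garbled in the extracted text, so
    `ratio` stands in for whatever quantity the paper uses.
    """
    return min(max(ratio, 0.9), 1.0)
```

Under this reading, β stays at 0.9 until the ratio exceeds 0.9, tracks it up to 1.0, and never exceeds 1.0.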