Transition-Informed Reinforcement Learning for Large-Scale Stackelberg Mean-Field Games

Authors: Pengdeng Li, Runsheng Yu, Xinrun Wang, Bo An

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments on fleet management and food gathering demonstrate that our approach can scale up to 100,000 followers and significantly outperform existing baselines." (Abstract) / "We evaluate our approach on two scenarios: the e-hailing driver re-positioning (EDRP) and multiple-type food gathering (MTFG)." (Section 5.1) |
| Researcher Affiliation | Academia | Pengdeng Li¹, Runsheng Yu², Xinrun Wang¹*, Bo An¹ (¹School of Computer Science and Engineering, Nanyang Technological University, Singapore; ²Hong Kong University of Science and Technology, Hong Kong, China); {pengdeng.li, xinrun.wang, boan}@ntu.edu.sg, runshengyu@gmail.com |
| Pseudocode | No | The paper does not include a figure, block, or section explicitly labeled as pseudocode or an algorithm. |
| Open Source Code | Yes | "Code is available at https://github.com/IpadLi/SMFG." |
| Open Datasets | Yes | "The EDRP environment is adapted from (Lin et al. 2018). In this scenario, the leader aims to improve the order response rate (ORR) of the whole city, while the followers maximize their own returns. ... we extract order information from a public dataset of taxi trips in Manhattan, which contains for each day the time and location of all the pickups and drop-offs executed by each of 13,000 active taxis." |
| Dataset Splits | No | The paper mentions the datasets used but does not specify train/validation/test splits. |
| Hardware Specification | Yes | "All experiments are run on a 64-bit workstation with 125 GB RAM, 20 Intel i9-9820X CPU @3.30GHz processors, and 4 NVIDIA RTX2080 Ti GPUs." |
| Software Dependencies | No | The paper describes the algorithmic frameworks used but does not name specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions). |
| Experiment Setup | No | The paper provides general experimental setup information such as baselines, environments, and the number of random seeds used for runs. However, it does not detail specific hyperparameters (e.g., learning rate, batch size, number of epochs) or other system-level training configurations. |