MetaLight: Value-Based Meta-Reinforcement Learning for Traffic Signal Control
Authors: Xinshi Zang, Huaxiu Yao, Guanjie Zheng, Nan Xu, Kai Xu, Zhenhui Li
AAAI 2020, pp. 1153–1160
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments on four real-world datasets show that our proposed MetaLight not only adapts more quickly and stably in new traffic scenarios, but also achieves better performance. |
| Researcher Affiliation | Collaboration | Xinshi Zang,1 Huaxiu Yao,2 Guanjie Zheng,2 Nan Xu,1 Kai Xu,3 Zhenhui Li2 1Shanghai Jiao Tong University, 2Pennsylvania State University, 3Shanghai Tianrang Intelligent Technology Co., Ltd |
| Pseudocode | Yes | Algorithm 1: Meta-training process of MetaLight |
| Open Source Code | Yes | Codes are provided at https://traffic-signal-control.github.io/ |
| Open Datasets | Yes | We use four real-world datasets from two cities in China: Jinan (JN) and Hangzhou (HZ), and two cities in the United States: Atlanta (AT) and Los Angeles (LA). ... The other raw data from American cities is composed of the full vehicle trajectories which are collected by several video cameras along the streets (https://ops.fhwa.dot.gov/trafficanalysistools/ngsim.htm). |
| Dataset Splits | No | The paper describes training and testing sets but does not provide explicit details about a separate validation dataset or its split percentages. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions using a simulation platform called CityFlow but does not provide its specific version number, nor does it list versions for other software dependencies. |
| Experiment Setup | Yes | In MetaLight, the base model, FRAP++, shares a similar network structure with FRAP (Zheng et al. 2019a), except for the average operation in the embedding layers. The learning rates of the learner and meta-learner are set to 0.001 for MetaLight and MAML in both meta-training and meta-testing. The episode length for all scenarios is 3600 seconds and the interval of each interaction between the simulator and the RL agent is 10 seconds. For MetaLight, the learner updates the model after each interaction using 30 samples and only one epoch of training. The meta-learner updates itself after every ten learner updates. For MAML, the learner first performs one centralized update at the end of each episode with 1000 samples and 100 epochs of training. Then, the meta-learner updates itself using new episodes each time. |
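The update schedule in the Experiment Setup row can be sketched as a toy loop. This is an illustrative reconstruction, not the authors' code: the `Learner` and `MetaLearner` classes and their `update` methods are hypothetical placeholders; only the numeric constants (3600-second episodes, 10-second interaction interval, 30-sample batches, meta-update every ten learner updates, learning rate 0.001) come from the paper.

```python
# Hypothetical sketch of MetaLight's meta-training update schedule.
# Learner/MetaLearner are placeholders standing in for the FRAP++ model
# and the meta-model; no actual gradient computation is performed.

EPISODE_LENGTH = 3600   # seconds per episode (from the paper)
INTERVAL = 10           # seconds between simulator-agent interactions
BATCH_SIZE = 30         # samples per learner update
META_EVERY = 10         # meta-learner updates once per 10 learner updates
LR = 0.001              # learning rate for both learner and meta-learner

class Learner:
    def __init__(self):
        self.updates = 0
    def update(self, batch_size, epochs=1):
        # one training pass over `batch_size` recent samples (placeholder)
        self.updates += 1

class MetaLearner:
    def __init__(self):
        self.updates = 0
    def update(self, learner):
        # fold the learner's adapted parameters into the meta-model (placeholder)
        self.updates += 1

learner, meta = Learner(), MetaLearner()
for step in range(EPISODE_LENGTH // INTERVAL):  # 360 interactions per episode
    learner.update(BATCH_SIZE, epochs=1)        # learner updates after each interaction
    if learner.updates % META_EVERY == 0:
        meta.update(learner)                    # meta-update every 10th learner update

print(learner.updates, meta.updates)  # 360 36
```

With these constants, one episode yields 360 learner updates and 36 meta-learner updates, matching the tenfold ratio the paper describes.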