DyETC: Dynamic Electronic Toll Collection for Traffic Congestion Alleviation

Authors: Haipeng Chen, Bo An, Guni Sharon, Josiah Hanna, Peter Stone, Chunyan Miao, Yeng Chai Soh

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that, compared with existing ETC schemes, DyETC increases traffic volume by around 8%, and reduces travel time by around 14.6% during rush hour. Third, we conduct extensive experimental evaluations to compare our proposed method with existing policy gradient methods as well as current tolling schemes.
Researcher Affiliation | Academia | Nanyang Technological University; University of Texas at Austin. {chen0939,boan,ascymiao,ecsoh}@ntu.edu.sg, gunisharon@gmail.com, {jphanna,pstone}@cs.utexas.edu
Pseudocode | Yes | Algorithm 1: PG-β (an illustrative sketch appears after this table)
Open Source Code | No | The paper does not provide any links to open-source code for its methodology, nor does it explicitly state that code will be released or is available.
Open Datasets | No | The paper mentions that
Dataset Splits | No | The paper states, “The number of episodes for training is 50,000 and the number of episodes for validation is 10,000.” However, this refers to the number of reinforcement learning episodes used for training and validation, not a fixed dataset split given in percentages or sample counts. The data is simulated from road network and demand parameters rather than drawn from a predefined dataset with explicit train/validation splits.
Hardware Specification | Yes | All algorithms are implemented using Java, all computations are performed on a 64-bit machine with 16 GB RAM and a quad-core Intel i7-4770 3.4 GHz processor.
Software Dependencies | No | All algorithms are implemented using Java (no further libraries, frameworks, or version numbers are specified).
Experiment Setup | Yes | The learning rates of the value function and policy function are hand-tuned as 10^-7 and 10^-10, respectively. The discount factor γ is set as 1, which assigns the same weight to rewards of different time periods in the finite time horizon H. The number of episodes for training is 50,000 and the number of episodes for validation is 10,000. The number of episodes for training PG-β is 500,000, and the learning rates for the value and policy functions are fine-tuned as 10^-8 and 10^-12, respectively. (These values are collected in the configuration sketch after this table.)
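
For reference, the hyperparameters quoted in the Experiment Setup row are collected below as a hypothetical Java configuration holder. The class name, field names, and grouping into two settings are illustrative assumptions; the paper does not describe such a class, and the values are simply restated from the quote above.

```java
// Hypothetical configuration holder restating the hyperparameters quoted in
// the Experiment Setup row. Class and field names are illustrative assumptions.
public final class DyEtcExperimentSetup {
    // First reported setting
    public static final double VALUE_LEARNING_RATE   = 1e-7;
    public static final double POLICY_LEARNING_RATE  = 1e-10;
    public static final double DISCOUNT_FACTOR_GAMMA = 1.0;   // equal weight across the finite horizon H
    public static final int    TRAINING_EPISODES     = 50_000;
    public static final int    VALIDATION_EPISODES   = 10_000;

    // Setting reported for PG-beta
    public static final int    PG_BETA_TRAINING_EPISODES    = 500_000;
    public static final double PG_BETA_VALUE_LEARNING_RATE  = 1e-8;
    public static final double PG_BETA_POLICY_LEARNING_RATE = 1e-12;

    private DyEtcExperimentSetup() {}
}
```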
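The Pseudocode row points to Algorithm 1: PG-β. Assuming the β refers to a Beta-distributed stochastic policy (a common choice when actions such as toll rates are bounded), the sketch below shows what a single-link REINFORCE-style update could look like in Java, the paper's implementation language. The class, the toll bound, the initial shape parameters, and the plain log-likelihood update rule are assumptions for illustration, not a reproduction of the paper's Algorithm 1; only the policy learning rate is taken from the quoted PG-β setting. It relies on Apache Commons Math 3 for Beta sampling and the digamma function.

```java
// Illustrative sketch of a REINFORCE-style update with a Beta-distributed
// toll policy, assuming that is what the "beta" in PG-beta refers to.
// Not the paper's Algorithm 1. Requires org.apache.commons:commons-math3.
import org.apache.commons.math3.distribution.BetaDistribution;
import org.apache.commons.math3.special.Gamma;

public class BetaPolicySketch {
    private double alpha = 2.0;                  // policy shape parameters (assumed initial values)
    private double beta  = 2.0;
    private static final double MAX_TOLL = 5.0;  // assumed upper bound on the toll
    private static final double LEARNING_RATE = 1e-12; // policy learning rate quoted for PG-beta

    /** Sample a toll in [0, MAX_TOLL] from the Beta policy. */
    public double sampleToll() {
        return new BetaDistribution(alpha, beta).sample() * MAX_TOLL;
    }

    /**
     * One REINFORCE-style step: move (alpha, beta) along the gradient of
     * log Beta(x; alpha, beta), scaled by the observed advantage.
     */
    public void update(double toll, double advantage) {
        double x = Math.min(1 - 1e-6, Math.max(1e-6, toll / MAX_TOLL));
        double common = Gamma.digamma(alpha + beta);
        double gradAlpha = Math.log(x)     - Gamma.digamma(alpha) + common;
        double gradBeta  = Math.log(1 - x) - Gamma.digamma(beta)  + common;
        alpha = Math.max(1e-3, alpha + LEARNING_RATE * advantage * gradAlpha);
        beta  = Math.max(1e-3, beta  + LEARNING_RATE * advantage * gradBeta);
    }
}
```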