AttendLight: Universal Attention-Based Reinforcement Learning Model for Traffic Signal Control
Authors: Afshin Oroojlooy, Mohammadreza Nazari, Davood Hajinezhad, Jorge Silva
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments were conducted with both synthetic and real-world standard benchmark data-sets. Our numerical experiment covers intersections with three or four approaching roads; one-directional/bi-directional roads with one, two, and three lanes; different number of phases; and different traffic flows. We consider two regimes: (i) single-environment training, single-deployment, and (ii) multi-environment training, multi-deployment. AttendLight outperforms both classical and other RL-based approaches on all cases in both regimes. |
| Researcher Affiliation | Industry | Afshin Oroojlooy, SAS Institute Inc., Cary, NC 27513, afshin.oroojlooy@sas.com; Mohammadreza Nazari, SAS Institute Inc., Cary, NC 27513, reza.nazari@sas.com; Davood Hajinezhad, SAS Institute Inc., Cary, NC 27513, davood.hajinezhad@sas.com; Jorge Silva, SAS Institute Inc., Cary, NC 27513, jorge.silva@sas.com |
| Pseudocode | No | The paper describes the REINFORCE algorithm in Appendix A.3 but does not provide a pseudocode block or a clearly labeled algorithm section with structured steps. |
| Open Source Code | No | The paper does not provide an explicit statement or a direct link to the source code for the AttendLight methodology described in the paper. Links provided in references are for third-party baseline implementations. |
| Open Datasets | Yes | To train and test AttendLight, a combination of real-world and synthetic traffic-data is utilized. For 4-way intersections with two lanes, we use the real-world traffic-data of intersections in Hangzhou and Atlanta [33, 37]. (References [1] and [2] provide GitHub links to these datasets.) |
| Dataset Splits | Yes | We divide the set of intersection instances into two segments: training and testing sets, each with 42 and 70 instances, respectively. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., CPU, GPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using "CityFlow [36]" for simulation but does not provide specific version numbers for CityFlow or any other key software dependencies. |
| Experiment Setup | Yes | To clear the intersection, the green light is followed by 5 seconds of yellow light. For each intersection, we run the planning for the next 600 seconds with a minimum active time of 10 seconds for each phase. To train AttendLight, we follow two regimes for the intersection sampling process: (i) train for a single environment and deploy on the same environment... and (ii) train on multiple environments and deploy on multiple environments... we trained five models with different random seeds and always report the average statistics. ...after 200 training episodes (instead of 100,000 episodes when trained from scratch). |
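
The timing constraints in the Experiment Setup row (600-second horizon, 10-second minimum active time, 5 seconds of yellow after green) can be illustrated with a minimal Python sketch. This is not the authors' implementation and does not use CityFlow: `SignalTiming`, `rollout_schedule`, and the `choose_phase` callback are hypothetical names, and the sketch assumes that the controller is queried once per minimum-green interval and that yellow is inserted only when the phase changes.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SignalTiming:
    horizon_s: int = 600    # planning horizon quoted in the table
    min_green_s: int = 10   # minimum active time for each phase
    yellow_s: int = 5       # yellow following green to clear the intersection


def rollout_schedule(
    choose_phase: Callable[[int, int], int],
    timing: SignalTiming = SignalTiming(),
) -> List[Tuple[int, int, str]]:
    """Return (start_time_s, phase, colour) segments for one episode.

    choose_phase(t, current_phase) is a stand-in for a policy
    (assumption: it is called once per minimum-green interval).
    """
    schedule: List[Tuple[int, int, str]] = []
    t = 0
    phase = choose_phase(t, -1)  # -1 means "no phase active yet"
    while t < timing.horizon_s:
        schedule.append((t, phase, "green"))
        t += timing.min_green_s
        if t >= timing.horizon_s:
            break
        next_phase = choose_phase(t, phase)
        if next_phase != phase:
            # Phase change: insert yellow before the next green.
            schedule.append((t, phase, "yellow"))
            t += timing.yellow_s
            phase = next_phase
    return schedule


if __name__ == "__main__":
    # Hypothetical round-robin controller over four phases.
    for segment in rollout_schedule(lambda t, cur: (cur + 1) % 4)[:4]:
        print(segment)
```

Under these assumptions, a round-robin controller spends 15 seconds per phase (10 green plus 5 yellow), while a controller that never switches keeps extending the same green in 10-second steps without any yellow.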