GPT-ST: Generative Pre-Training of Spatio-Temporal Graph Neural Networks

Authors: Zhonghang Li, Lianghao Xia, Yong Xu, Chao Huang

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on representative benchmarks demonstrate the effectiveness of our proposed method. We conduct extensive experiments on real-world datasets, and the demonstrated improvement in the performance of diverse downstream baselines showcases the superior performance of GPT-ST.
Researcher Affiliation | Academia | Zhonghang Li (1,2), Lianghao Xia (2), Yong Xu (1,3,4), Chao Huang (2); 1 South China University of Technology, 2 University of Hong Kong, 3 PAZHOU LAB, 4 Guangdong Key Lab of Communication and Computer Network
Pseudocode | Yes | A.1.2 Algorithm Process: Algorithm 1 represents the process of the adaptive mask strategy described in Section 4.3, while Algorithm 2 illustrates the algorithmic process during the pre-training stage.
Open Source Code | Yes | We have made our model implementation publicly available at https://github.com/HKUDS/GPT-ST.
Open Datasets | Yes | To evaluate the effectiveness of our proposed method, we conduct experiments on four real-world ST datasets: PEMS08 [32], METR-LA [23], NYC Taxi, and NYC Citi Bike [46], which record traffic flow, traffic speed, taxi order records, and bike order records, respectively.
Dataset Splits | Yes | We divide the METR-LA dataset into training, validation, and test sets in a ratio of 7:1:2, and 6:2:2 for the other datasets.
Hardware Specification | Yes | To ensure fairness, all experiments are conducted on a system equipped with a GTX 3090 GPU and an Intel Core i9-12900K CPU.
Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper's text.
Experiment Setup | Yes | Following previous works [13, 32, 22], the number of time slots L is set to 12. The latent representation dimensionality (d) and the customized parameter dimensionality (d′) are set to 64 and 16, respectively. The numbers of hyperedges (H_T, H_S, H_M) are set to 8, 10, and 16, respectively. We perform 2 dynamic routing iterations. Additionally, the balance ratio (λ) between L_r and L_kl is set to 0.1, and the total mask ratio (r_t) is set to 0.25. The batch size for all methods, except those tagged with b8, is set to 64 (where b8 denotes a batch size of 8).
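
The Dataset Splits row reports chronological 7:1:2 (METR-LA) and 6:2:2 (other datasets) splits. The sketch below shows one way such a time-ordered split can be implemented; the function name and slicing logic are illustrative assumptions, not the authors' released preprocessing code.

```python
import numpy as np

def chronological_split(data: np.ndarray, ratios=(0.7, 0.1, 0.2)):
    """Split a [num_timesteps, num_nodes, num_features] array along time.

    Illustrative helper only: 7:1:2 is used for METR-LA and 6:2:2 for the
    other datasets, per the paper's stated splits.
    """
    n = data.shape[0]
    train_end = int(n * ratios[0])
    val_end = train_end + int(n * ratios[1])
    return data[:train_end], data[train_end:val_end], data[val_end:]

# Example usage (array names are hypothetical):
# train, val, test = chronological_split(metr_la_array, ratios=(0.7, 0.1, 0.2))
# train, val, test = chronological_split(pems08_array, ratios=(0.6, 0.2, 0.2))
```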
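The hyperparameters quoted in the Experiment Setup row can be collected into a single configuration for reference. The key names below are assumptions for readability; the released code at https://github.com/HKUDS/GPT-ST may use different identifiers.

```python
# Hyperparameters as reported in the paper's experiment setup.
GPT_ST_CONFIG = {
    "num_time_slots": 12,          # L, following prior works [13, 32, 22]
    "hidden_dim": 64,              # latent representation dimensionality d
    "customized_dim": 16,          # customized parameter dimensionality d′
    "num_hyperedges": {            # (H_T, H_S, H_M)
        "H_T": 8,
        "H_S": 10,
        "H_M": 16,
    },
    "dynamic_routing_iters": 2,
    "lambda_balance": 0.1,         # balance ratio between L_r and L_kl
    "total_mask_ratio": 0.25,      # r_t
    "batch_size": 64,              # 8 for baselines tagged with "b8"
}
```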
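The Pseudocode row points to Algorithm 1, the adaptive mask strategy, whose details are not reproduced in this section. The snippet below is only a generic random-masking placeholder at the reported total mask ratio r_t = 0.25, included to show where a mask enters a masked-reconstruction pre-training step; it is not the authors' adaptive strategy.

```python
import torch

def random_mask(x: torch.Tensor, mask_ratio: float = 0.25):
    """Mask a fraction of entries in a [batch, time, nodes, features] tensor.

    Placeholder for the paper's adaptive mask strategy (Algorithm 1): here the
    mask is drawn uniformly at random at the reported total ratio r_t = 0.25,
    whereas the actual strategy selects entries adaptively (see the released code).
    """
    mask = torch.rand(x.shape[:-1], device=x.device) < mask_ratio  # True = masked
    x_masked = x.masked_fill(mask.unsqueeze(-1), 0.0)
    return x_masked, mask
```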