When Transfer Learning Meets Cross-City Urban Flow Prediction: Spatio-Temporal Adaptation Matters

Authors: Ziquan Fang, Dongen Wu, Lu Pan, Lu Chen, Yunjun Gao

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on five real datasets show STAN substantially outperforms state-of-the-art methods.
Researcher Affiliation | Academia | (1) College of Computer Science, Zhejiang University, Hangzhou, China; (2) College of Computer Science, Zhejiang University of Technology, Hangzhou, China
Pseudocode | No | The paper describes the model architecture and mathematical formulations but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the methodology described.
Open Datasets | Yes | We adopt five popular open-source urban crowd datasets, namely NYCTaxi, NYCBike, CHIBike, BJTaxi, and Chengdu, which are commonly used in related studies. Table 1 summarizes the statistics of the datasets.
Dataset Splits | No | The paper specifies train and test splits (e.g., '9 months of data [...] for training and the rest for testing'), but it does not explicitly mention a separate validation split for hyperparameter tuning or early stopping. (See the split sketch after this table.)
Hardware Specification | Yes | STAN is implemented with Pytorch framework on GTX-3090 24G GPU.
Software Dependencies | No | The paper mentions 'Pytorch framework' but does not provide a specific version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | Parameter Settings. [...] Then, we select 3 days of historical data where each day contains 9 time intervals. Further, we compute inflow/outflow of each region based on Eq. 1 and then normalize the flow data into [0, 1]. In terms of SAAM, the convolution kernel size is set to 3x3, and the domain discriminator contains two fully-connected layers with sizes are 64 and 1. In terms of TAAM, the length of LSTM and hidden feature size are set to 9 and 128. In terms of PM, two connected layers sizes are 256 and 512. Then, we set β and γ as 0.1. As for model training, the epoch size, dropout, and learning rate are set to 32, 0.5, and 1e-6. Besides, after 50 epochs, we reduce the learning rate to 0.9 times the original after every five epochs. (A configuration sketch collecting these values follows the table.)
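Because the paper fixes only a chronological train/test boundary, anyone reproducing STAN has to choose their own validation scheme. Below is a minimal sketch of one common choice, holding out the tail of the training range; the function name, the 12-month default, and the 10% validation fraction are our assumptions, not taken from the paper.

```python
import numpy as np

def chronological_split(flows: np.ndarray, train_months: int = 9,
                        total_months: int = 12, val_fraction: float = 0.1):
    """Split a time-ordered flow array of shape [T, ...].

    The paper uses the first 9 months for training and the rest for
    testing; the validation block here is an assumption (the paper does
    not mention one), carved from the tail of the training range so no
    future data leaks into training.
    """
    train_end = int(len(flows) * train_months / total_months)
    val_start = int(train_end * (1.0 - val_fraction))
    return flows[:val_start], flows[val_start:train_end], flows[train_end:]
```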
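The parameter settings above name the components and hyperparameter values but not the exact wiring. The sketch below, in PyTorch, collects the quoted numbers in one place; the layer arrangement, the two-channel (inflow/outflow) input, the pooling step, the Adam optimizer, the dropout placement, and the reading of "epoch size" as batch size are all assumptions on our part, since the paper releases no code.

```python
import torch
import torch.nn as nn

class STANSketch(nn.Module):
    """Hedged sketch of the modules named in the paper's parameter settings."""

    def __init__(self, in_channels: int = 2, feat: int = 64):
        super().__init__()
        # SAAM: 3x3 convolution kernel (paper); using a single conv layer
        # is an assumption.
        self.saam_conv = nn.Conv2d(in_channels, feat, kernel_size=3, padding=1)
        # SAAM domain discriminator: two fully-connected layers, sizes 64 and 1 (paper).
        self.discriminator = nn.Sequential(nn.Linear(feat, 64),
                                           nn.ReLU(),
                                           nn.Linear(64, 1))
        # TAAM: LSTM over 9 historical time intervals, hidden size 128 (paper).
        self.taam_lstm = nn.LSTM(input_size=feat, hidden_size=128,
                                 batch_first=True)
        # PM: two fully-connected layers of sizes 256 and 512 (paper),
        # with dropout 0.5 (paper); dropout placement is an assumption.
        self.pm = nn.Sequential(nn.Linear(128, 256),
                                nn.ReLU(),
                                nn.Dropout(p=0.5),
                                nn.Linear(256, 512))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [batch, 9, C, H, W], flows min-max normalized into [0, 1] (paper).
        # Global average pooling to a per-frame feature vector is an assumption;
        # self.discriminator would be applied to these features for adversarial
        # domain adaptation and is omitted here.
        b, t, c, h, w = x.shape
        f = self.saam_conv(x.reshape(b * t, c, h, w))
        f = f.mean(dim=(2, 3)).reshape(b, t, -1)
        out, _ = self.taam_lstm(f)
        return self.pm(out[:, -1])

beta, gamma = 0.1, 0.1   # loss weights β and γ (paper)
batch_size = 32          # the paper's "epoch size" of 32, read as batch size
model = STANSketch()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-6)  # lr 1e-6 (paper)
# After 50 epochs, multiply the learning rate by 0.9 every 5 epochs (paper);
# step this scheduler once per epoch.
scheduler = torch.optim.lr_scheduler.MultiplicativeLR(
    optimizer,
    lr_lambda=lambda epoch: 0.9 if epoch >= 50 and epoch % 5 == 0 else 1.0)
```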