Learning from Highly Sparse Spatio-temporal Data

Authors: Leyan Deng, Chenwang Wu, Defu Lian, Enhong Chen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed model across various downstream tasks involving highly sparse spatio-temporal data. Empirical results indicate that our model outperforms state-of-the-art imputation methods, demonstrating its effectiveness and robustness.
Researcher Affiliation | Academia | School of Artificial Intelligence and Data Science, University of Science and Technology of China; School of Computer Science and Technology, University of Science and Technology of China; {dleyan, wcw1996}@mail.ustc.edu.cn, {liandefu, cheneh}@ustc.edu.cn
Pseudocode | No | The paper describes its methodology using text and mathematical equations in Section 5, but it does not include a formally labeled "Algorithm" or "Pseudocode" block or figure.
Open Source Code | Yes | The source code and datasets are available at https://github.com/dleyan/OPCR.
Open Datasets | Yes | We consider three sets of spatio-temporal datasets and summarize their statistics in Table 1. ...Traffic dataset [22]... Large-scale dataset [23]... T4C22 dataset [24]...
Dataset Splits | Yes | We fix the maximum number of epochs to 300 and we use early stopping on the validation set with patience of 40 epochs. On the large-scale dataset, ...We fix the maximum number of epochs to 200 and we use early stopping on the validation set with patience of 10 epochs.
Hardware Specification | Yes | On the traffic dataset, we train all models with an RTX 3090 GPU (24GB RAM). On the large-scale dataset and the T4C22 dataset, we train them with a V100 GPU (16GB RAM).
Software Dependencies | No | All the baselines have been implemented in PyTorch [26]. The paper mentions PyTorch but does not specify a version number or other software dependencies with their respective versions.
Experiment Setup | Yes | For the spatio-temporal data imputation task, we select windows of length 24. On the traffic datasets, we strictly follow the settings of SPIN. We fix the maximum number of epochs to 300 and we use early stopping on the validation set with patience of 40 epochs. On the large-scale dataset, considering memory capacity and computational efficiency, we reduced the number of hidden states to 16 for some baselines (i.e., SPIN-H, CSDI, and PriSTI). We fix the maximum number of epochs to 200 and we use early stopping on the validation set with patience of 10 epochs. ... For the congestion classification task, we train all the models for 20 epochs. For the travel time prediction task, we train all the models for 50 epochs. ... As the hidden states go from 16 to 256, the overall performance slowly increases and then fluctuates slightly. Therefore, we set the number of hidden states to 128. ... The approximate best level can be reached when the number of layers is only 2.
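
To make the reported training protocol concrete, the sketch below implements early stopping on a validation split with the traffic-dataset limits (300 epochs, patience of 40). It is a minimal illustration, not the authors' code: the function name, the Adam optimizer and its learning rate, and the train_loader/val_loader objects are assumptions introduced here; the actual training script is in the released repository at https://github.com/dleyan/OPCR.

import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader, loss_fn,
                              max_epochs=300, patience=40, lr=1e-3):
    # Train until max_epochs, or stop early once validation loss has not
    # improved for `patience` consecutive epochs (300/40 on the traffic
    # datasets; the paper reports 200/10 on the large-scale dataset).
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # optimizer choice is an assumption
    best_val, best_state, stale_epochs = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()

        # Score the held-out validation split that drives early stopping.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader)

        if val_loss < best_val:
            best_val = val_loss
            best_state = copy.deepcopy(model.state_dict())  # keep the best checkpoint
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break  # patience exhausted: stop early

    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best validation checkpoint
    return model

The model construction is deliberately left abstract here; per the setup row above, a faithful configuration would use windows of length 24, 128 hidden states, and 2 layers, with the epoch limit and patience changed to 200 and 10 on the large-scale dataset.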