Learning from Highly Sparse Spatio-temporal Data

Authors: Leyan Deng, Chenwang Wu, Defu Lian, Enhong Chen

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed model across various downstream tasks involving highly sparse spatio-temporal data. Empirical results indicate that our model outperforms state-of-the-art imputation methods, demonstrating its effectiveness and robustness.
Researcher Affiliation | Academia | School of Artificial Intelligence and Data Science, University of Science and Technology of China; School of Computer Science and Technology, University of Science and Technology of China; {dleyan, wcw1996}@mail.ustc.edu.cn, {liandefu, cheneh}@ustc.edu.cn
Pseudocode | No | The paper describes its methodology using text and mathematical equations in Section 5, but it does not include a formally labeled "Algorithm" or "Pseudocode" block or figure.
Open Source Code | Yes | The source code and datasets are available at https://github.com/dleyan/OPCR.
Open Datasets | Yes | We consider three sets of spatio-temporal datasets and summarize their statistics in Table 1. ...Traffic dataset [22]... Large-scale dataset [23]... T4C22 dataset [24]...
Dataset Splits | Yes | We fix the maximum number of epochs to 300 and we use early stopping on the validation set with patience of 40 epochs. On the large-scale dataset, ...We fix the maximum number of epochs to 200 and we use early stopping on the validation set with patience of 10 epochs.
Hardware Specification | Yes | On the traffic dataset, we train all models with an RTX 3090 GPU (24GB RAM). On the large-scale dataset and the T4C22 dataset, we train them with a V100 GPU (16GB RAM).
Software Dependencies | No | All the baselines have been implemented in PyTorch [26]. The paper mentions PyTorch but does not specify a version number or other software dependencies with their respective versions.
Experiment Setup | Yes | For the spatio-temporal data imputation task, we select windows of length 24. On the traffic datasets, we strictly follow the settings of SPIN. We fix the maximum number of epochs to 300 and we use early stopping on the validation set with patience of 40 epochs. On the large-scale dataset, considering memory capacity and computational efficiency, we reduced the number of hidden states to 16 for some baselines (i.e., SPIN-H, CSDI, and PriSTI). We fix the maximum number of epochs to 200 and we use early stopping on the validation set with patience of 10 epochs. ... For the congestion classification task, we train all the models for 20 epochs. For the travel time prediction task, we train all the models for 50 epochs. ... As the hidden states go from 16 to 256, the overall performance slowly increases and then fluctuates slightly. Therefore, we set the number of hidden states to 128. ... The approximate best level can be reached when the number of layers is only 2.
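
To make the reported training protocol concrete, the sketch below implements early stopping on a validation split with the traffic-dataset limits (300 epochs, patience of 40). It is a minimal illustration, not the authors' code: the function name, the Adam optimizer and its learning rate, and the train_loader/val_loader objects are assumptions introduced here; the actual training script is in the released repository at https://github.com/dleyan/OPCR.

import copy
import torch

def train_with_early_stopping(model, train_loader, val_loader, loss_fn,
                              max_epochs=300, patience=40, lr=1e-3):
    # Train until max_epochs, or stop early once validation loss has not
    # improved for `patience` consecutive epochs (300/40 on the traffic
    # datasets; the paper reports 200/10 on the large-scale dataset).
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # optimizer choice is an assumption
    best_val, best_state, stale_epochs = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()

        # Score the held-out validation split that drives early stopping.
        model.eval()
        with torch.no_grad():
            val_loss = sum(loss_fn(model(x), y).item() for x, y in val_loader)

        if val_loss < best_val:
            best_val = val_loss
            best_state = copy.deepcopy(model.state_dict())  # keep the best checkpoint
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break  # patience exhausted: stop early

    if best_state is not None:
        model.load_state_dict(best_state)  # restore the best validation checkpoint
    return model

The model construction is deliberately left abstract here; per the setup row above, a faithful configuration would use windows of length 24, 128 hidden states, and 2 layers, with the epoch limit and patience changed to 200 and 10 on the large-scale dataset.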