Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

Authors: Yuan Yuan, Chenyang Shao, Jingtao Ding, Depeng Jin, Yong Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on multiple real-world scenarios demonstrate that GPD achieves superior performance towards data-scarce scenarios with an average improvement of 7.87% over the best baseline on four datasets.
Researcher Affiliation | Academia | Department of Electronic Engineering, BNRist, Tsinghua University, Beijing, China
Pseudocode | Yes | Algorithm 1 in Appendix A.5 illustrates the training process. Then we transform the parameters of each model into a vector-based format. ... Algorithm 2 illustrates the pre-training procedure. ... Algorithm 3 illustrates the procedure.
Open Source Code | Yes | The implementation of our approach is available: https://github.com/tsinghua-fib-lab/GPD.
Open Datasets | Yes | We conduct experiments on two types of spatio-temporal prediction tasks: crowd flow prediction and traffic speed prediction. As for crowd flow prediction, we conducted experiments on three real-world datasets, including New York City, Washington, D.C., and Baltimore. ... As for traffic speed prediction, we conduct experiments on four real-world datasets, including METR-LA, PEMS-BAY, Didi-Chengdu, and Didi-Shenzhen. ... We conducted our performance evaluation using four real-world traffic speed datasets, following the data preprocessing procedures established in prior literature (Li et al., 2018; Lu et al., 2022).
Dataset Splits | No | For both tasks, we categorized the datasets into source cities and one target city. For example, if one specific city is set as the target dataset, we assume access to only a limited quantity of data, such as three days of data (existing models usually require several months of data for training). The paper does not explicitly state training/validation/test splits with percentages or counts for reproducibility.
Hardware Specification | Yes | Our framework can be effectively trained within 3 hours and all experiments were completed on one NVIDIA GeForce RTX 4090.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | In the experiments, we set the number of diffusion steps N=500. The learning rate is set to 8e-5 and the number of training epochs ranges from 3000 to 12000. The dimensions of the KG embedding and time embedding are both 128. Regarding the spatio-temporal prediction, we use 12 historical time steps to predict 6 future time steps.
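The "vector-based format" mentioned in the Pseudocode row — flattening each trained model's parameters into a single vector that a diffusion model can generate — can be sketched as below. This is a minimal NumPy illustration with toy, assumed parameter shapes, not the authors' implementation.

```python
import numpy as np

def params_to_vector(param_dict):
    """Flatten a dict of named parameter arrays into one 1-D vector,
    recording shapes so the transform is invertible."""
    shapes = {name: p.shape for name, p in param_dict.items()}
    vec = np.concatenate([p.ravel() for p in param_dict.values()])
    return vec, shapes

def vector_to_params(vec, shapes):
    """Inverse transform: split the vector back into named arrays."""
    params, offset = {}, 0
    for name, shape in shapes.items():
        size = int(np.prod(shape))
        params[name] = vec[offset:offset + size].reshape(shape)
        offset += size
    return params

# Toy "spatio-temporal model" parameters (hypothetical names and shapes).
params = {"W_in": np.random.randn(12, 64), "W_out": np.random.randn(64, 6)}
vec, shapes = params_to_vector(params)   # vec has 12*64 + 64*6 = 1152 entries
restored = vector_to_params(vec, shapes)
```

The round trip is lossless, which is what lets generated vectors be mapped back into usable model weights.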
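The three-day target-city constraint described in the Dataset Splits row can be made concrete with a small split helper. The hourly granularity and the function name here are assumptions for illustration, not details stated in the paper.

```python
HOURS_PER_DAY = 24

def few_shot_split(series, few_shot_days=3):
    """Return (few-shot adaptation slice, held-out remainder) for a
    target city, keeping only the first `few_shot_days` of readings."""
    cut = few_shot_days * HOURS_PER_DAY
    return series[:cut], series[cut:]

# One month of hourly readings for a hypothetical target city.
target_city = list(range(24 * 30))
train, test = few_shot_split(target_city)  # 72 points for adaptation
```

Source cities would keep their full histories; only the target city is truncated this way.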
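The hyperparameters quoted in the Experiment Setup row, together with the 12-step-in / 6-step-out prediction windowing, can be collected into a short sketch. The numeric values come from the paper; the dataclass and the windowing function are illustrative, not the authors' code.

```python
from dataclasses import dataclass

@dataclass
class GPDConfig:
    diffusion_steps: int = 500    # N in the paper
    learning_rate: float = 8e-5
    kg_embed_dim: int = 128       # knowledge-graph embedding dimension
    time_embed_dim: int = 128
    history_len: int = 12         # historical time steps used as input
    horizon: int = 6              # future time steps to predict

def make_windows(series, cfg):
    """Slide a (history_len -> horizon) window over a 1-D series."""
    xs, ys = [], []
    step = cfg.history_len + cfg.horizon
    for i in range(len(series) - step + 1):
        xs.append(series[i:i + cfg.history_len])
        ys.append(series[i + cfg.history_len:i + step])
    return xs, ys

cfg = GPDConfig()
xs, ys = make_windows(list(range(100)), cfg)  # 83 overlapping windows
```

Training epochs (3000 to 12000) are omitted from the dataclass since the paper gives a per-dataset range rather than a single value.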