Retrieval-Augmented Diffusion Models for Time Series Forecasting

Authors: Jingwei Liu, Ling Yang, Hongyan Li, Shenda Hong

NeurIPS 2024

Reproducibility assessment: each variable below is listed with its assessed result and the supporting LLM response.
Research Type: Experimental
LLM Response: "Experiments and visualizations on multiple datasets demonstrate the effectiveness of our approach, particularly in complicated prediction tasks. We conducted experiments on five real-world datasets and provided a comprehensive presentation and analysis of the results using multiple metrics."
Researcher Affiliation: Academia
LLM Response: "1) School of Intelligence Science and Technology, Peking University; 2) National Key Laboratory of General Artificial Intelligence, Peking University; 3) Institute of Medical Technology, Peking University Health Science Center; 4) National Institute of Health Data Science, Peking University; 5) Institute for Artificial Intelligence, Peking University"
Pseudocode: Yes
LLM Response: "Algorithm 1: Training Procedure of RATD; Algorithm 2: Sampling Procedure of RATD"
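The paper's Algorithms 1 and 2 are not reproduced on this page, but the pattern they describe can be illustrated with a short PyTorch sketch: a denoiser trained with a noise-prediction loss while conditioned on the historical window together with nearest-neighbour references retrieved from a database. This is a minimal sketch only; it uses a plain DDPM objective rather than the paper's shifted diffusion, and every module name, shape, and the mean-pooled retrieval below are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

T_STEPS = 100  # diffusion steps, matching the reported experiment setup

class Denoiser(nn.Module):
    """Toy denoiser conditioned on the history window and retrieved references."""
    def __init__(self, dim, cond_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + cond_dim + 1, 256), nn.SiLU(), nn.Linear(256, dim)
        )

    def forward(self, x_t, t, cond):
        t_emb = t.float().unsqueeze(-1) / T_STEPS  # crude timestep embedding
        return self.net(torch.cat([x_t, cond, t_emb], dim=-1))

def retrieve(history, database, k=3):
    # Nearest-neighbour retrieval over an (N, hist_len) database of past
    # windows; the k closest references are mean-pooled into one vector.
    dists = torch.cdist(history, database)          # (B, N)
    idx = dists.topk(k, largest=False).indices      # (B, k)
    return database[idx].mean(dim=1)                # (B, hist_len)

def training_step(model, betas, history, target, database, opt):
    # One DDPM-style noise-prediction step with retrieval-augmented conditioning.
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)
    t = torch.randint(0, T_STEPS, (target.size(0),))
    noise = torch.randn_like(target)
    a = alphas_bar[t].unsqueeze(-1)
    x_t = a.sqrt() * target + (1 - a).sqrt() * noise  # forward diffusion
    cond = torch.cat([history, retrieve(history, database)], dim=-1)
    loss = F.mse_loss(model(x_t, t, cond), noise)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Hypothetical usage with the reported window lengths (history 168 -> horizon 96):
model = Denoiser(dim=96, cond_dim=2 * 168)
opt = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.95, 0.999))
betas = torch.linspace(1e-4, 0.02, T_STEPS)       # assumed noise schedule
database = torch.randn(500, 168)                  # stand-in retrieval database
loss = training_step(model, betas, torch.randn(32, 168), torch.randn(32, 96), database, opt)
```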
Open Source Code: Yes
LLM Response: "Our code is available at https://github.com/stanliu96/RATD"
Open Datasets: Yes
LLM Response: "Experiments are performed on four popular real-world time series datasets: (1) Electricity, which includes the hourly electricity consumption data from 321 clients over two years; (2) Wind [20], which contains wind power records from 2020-2021; (3) Exchange [18], which describes the daily exchange rates of eight countries (Australia, Britain, Canada, Switzerland, China, Japan, New Zealand, and Singapore); (4) Weather, which documents 21 meteorological indicators at 10-minute intervals spanning from 2020 to 2021. In addition, our method is also applied to a large ECG time series dataset: MIMIC-IV-ECG [14]."
Dataset Splits: Yes
LLM Response: "Our dataset is split in the proportion of 7:1:2 (train:validation:test), utilizing a random splitting strategy to ensure diversity in the training set."
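For concreteness, here is a minimal sketch of the reported 7:1:2 random split, assuming a generic PyTorch Dataset; the released repository may implement the split differently, and the dummy tensors and seed are placeholders.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Dummy data standing in for (history, target) pairs of length 168 / 96.
dataset = TensorDataset(torch.randn(1000, 168), torch.randn(1000, 96))

n = len(dataset)
n_train, n_val = int(0.7 * n), int(0.1 * n)
n_test = n - n_train - n_val                      # remainder goes to test

train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(0),   # fixed seed for reproducibility
)
```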
Hardware Specification: Yes
LLM Response: "All experiments were conducted on an Nvidia RTX A6000 GPU with 40GB memory."
Software Dependencies: Yes
LLM Response: "a 1-layer Transformer encoder implemented in PyTorch [27]"
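A 1-layer Transformer encoder of the kind quoted above can be built directly from torch.nn; the embedding dimension, head count, and feedforward width below are assumptions rather than values from the paper.

```python
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(
    d_model=64, nhead=4, dim_feedforward=128, batch_first=True
)
encoder = nn.TransformerEncoder(layer, num_layers=1)  # the quoted 1-layer encoder

x = torch.randn(8, 168, 64)  # (batch, history length 168, feature dim)
h = encoder(x)               # output keeps the input shape
```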
Experiment Setup: Yes
LLM Response: "The length of the historical time series was 168, and the prediction lengths were (96, 192, 336), with results averaged. For training, we utilized the Adam optimizer with an initial learning rate of 10^-3 and betas = (0.95, 0.999). During the training process of the shifted diffusion, the batch size was set to 64, and early stopping was applied for a maximum of 200 epochs. The diffusion steps T were set to 100."
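The reported hyperparameters can be wired together as follows. This is a hedged sketch: the placeholder model, the per-epoch function, and the early-stopping patience are stand-ins (the quoted text gives a 200-epoch cap but no patience value).

```python
import torch

model = torch.nn.Linear(168, 96)  # placeholder for the actual RATD model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.95, 0.999))

BATCH_SIZE = 64                   # reported batch size
MAX_EPOCHS = 200                  # reported early-stopping cap
DIFFUSION_STEPS = 100             # reported diffusion steps T
HIST_LEN, PRED_LENS = 168, (96, 192, 336)  # reported window lengths

def run_epoch():
    # Placeholder for one epoch of training followed by validation.
    return torch.rand(1).item()

best_val, patience, bad_epochs = float("inf"), 10, 0  # patience of 10 is assumed
for epoch in range(MAX_EPOCHS):
    val_loss = run_epoch()
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # early stopping within the 200-epoch cap
```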