Generative Time Series Forecasting with Diffusion, Denoise, and Disentanglement

Authors: Yan Li, Xinjiang Lu, Yaqing Wang, Dejing Dou

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on synthetic and real-world data show that D3VAE outperforms competitive algorithms with remarkable margins. Our implementation is available at https://github.com/PaddlePaddle/PaddleSpatial/tree/main/research/D3VAE. ... 3 Experiments ... Table 1: Performance comparisons on synthetic data in terms of MSE and CRPS. ... Table 2: The performance comparisons on real-world datasets in terms of MSE and CRPS...
Researcher Affiliation | Collaboration | Yan Li, Xinjiang Lu, Yaqing Wang, Dejing Dou; Business Intelligence Lab, Baidu Research; Zhejiang University, China; ly21121@zju.edu.cn, {luxinjiang,wangyaqing01,doudejing}@baidu.com
Pseudocode | Yes | Algorithm 1 Training Procedure. Algorithm 2 Forecasting Procedure.
Open Source Code | Yes | Our implementation is available at https://github.com/PaddlePaddle/PaddleSpatial/tree/main/research/D3VAE.
Open Datasets | Yes | Six real-world datasets with diverse spatiotemporal dynamics are selected, including Traffic [27], Electricity (https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014), Weather (https://www.bgc-jena.mpg.de/wetter/), Wind (Wind Power, published at https://github.com/PaddlePaddle/PaddleSpatial/tree/main/paddlespatial/datasets/WindPower), and ETTs [56] (ETTm1 and ETTh1).
Dataset Splits | Yes | All datasets are split chronologically and adopt the same train/validation/test ratios, i.e., 7:1:2. (A minimal split sketch follows after this table.)
Hardware Specification | Yes | All the experiments were carried out on a Linux machine with a single NVIDIA P40 GPU.
Software Dependencies | No | The paper mentions the Adam optimizer and implies the use of PaddlePaddle through a GitHub link, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We use the Adam optimizer with an initial learning rate of 5e-4. The batch size is 16, and the training is set to 20 epochs at most, equipped with early stopping. The number of disentanglement factors is chosen from {4, 8}, and β_t is set to range from 0.01 to 0.1 with different diffusion steps T ∈ [100, 1000], then ω is set to 0.1. The trade-off hyperparameters are set as ψ = 0.05, λ = 0.1, γ = 0.001 for ETTs, and ψ = 0.5, λ = 1.0, γ = 0.01 for others. (A hedged configuration sketch follows after this table.)
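The chronological 7:1:2 split described in the Dataset Splits row can be illustrated with a minimal sketch; the function name and NumPy usage are illustrative only and are not taken from the authors' code.

```python
import numpy as np

def chronological_split(series: np.ndarray, ratios=(0.7, 0.1, 0.2)):
    """Split a time series into train/validation/test blocks in time order
    (7:1:2), i.e., without shuffling across chronological boundaries."""
    n = len(series)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return series[:n_train], series[n_train:n_train + n_val], series[n_train + n_val:]

# Example: 1000 time steps -> 700 / 100 / 200
train, val, test = chronological_split(np.arange(1000))
```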
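A minimal sketch of how the stated optimizer settings and diffusion variance schedule could be instantiated. The dictionary keys and the assumption of a linear β schedule are illustrative: the paper only states that β_t ranges from 0.01 to 0.1 over T ∈ [100, 1000] steps, and this is not the authors' PaddlePaddle implementation.

```python
import numpy as np

# Settings quoted in the "Experiment Setup" row; variable names are illustrative.
config = {
    "learning_rate": 5e-4,      # Adam optimizer, initial learning rate
    "batch_size": 16,
    "max_epochs": 20,           # with early stopping
    "num_factors": 8,           # disentanglement factors, chosen from {4, 8}
    "omega": 0.1,
    "psi": 0.05, "lambda": 0.1, "gamma": 0.001,  # trade-off weights for ETTs
}

def beta_schedule(T: int, beta_start: float = 0.01, beta_end: float = 0.1) -> np.ndarray:
    """Variance schedule beta_t spanning 0.01 to 0.1 over T diffusion steps,
    with T chosen from [100, 1000]; linear spacing is an assumption here."""
    return np.linspace(beta_start, beta_end, T)

betas = beta_schedule(T=100)
alpha_bars = np.cumprod(1.0 - betas)  # cumulative product used by the standard diffusion forward process
```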