Non-autoregressive Conditional Diffusion Models for Time Series Prediction

Authors: Lifeng Shen, James Kwok

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are performed on nine real-world datasets. Results show that TimeDiff consistently outperforms existing time series diffusion models, and also achieves the best overall performance across a variety of existing strong baselines (including transformers and FiLM).
Researcher Affiliation | Academia | (1) Division of Emerging Interdisciplinary Areas, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. (2) Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. Correspondence to: Lifeng Shen <lshenae@connect.ust.hk>, James T. Kwok <jamesk@cse.ust.hk>.
Pseudocode | Yes | Algorithm 1 (Training) and Algorithm 2 (Inference). A generic conditional-diffusion sketch of both loops is given after the table.
Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the proposed TimeDiff model. It only lists where the code for the baselines was downloaded from in Appendix B.2.
Open Datasets | Yes | Experiments are performed on nine real-world time series datasets (Table 1) (Zhou et al., 2021; Wu et al., 2021; Fan et al., 2022): (i) NorPool... (ii) Caiso... (iii) Traffic... (iv) Electricity... (v) Weather... (vi) Exchange (Lai et al., 2018)... (vii)-(viii) ETTh1 and ETTm1... (ix) Wind (Li et al., 2022b).
Dataset Splits | Yes | For the other datasets, we follow (Wu et al., 2021; Zhou et al., 2022b) and split the whole dataset into training, validation, and test sets in chronological order with a ratio of 6:2:2 for ETTh1 and ETTm1, and 7:1:2 for Weather, Wind, Traffic, Electricity, and Exchange. A chronological-split helper along these lines is sketched after the table.
Hardware Specification | Yes | All experiments are run on an Nvidia RTX A6000 GPU with 40GB memory.
Software Dependencies | No | The paper mentions using the Adam optimizer and DPM-Solver, but it does not provide specific version numbers for these or for other key software components such as Python, PyTorch, or CUDA, which would be needed for reproducible software dependency information.
Experiment Setup | Yes | We train the proposed model using Adam (Kingma & Ba, 2015) with a learning rate of 10^-3. The batch size is 64, and training uses early stopping for a maximum of 100 epochs. K = 100 diffusion steps are used, with a cosine variance schedule (Rasul et al., 2021) starting from β1 = 10^-4 to βK = 10^-1. A configuration sketch with these hyperparameters follows the table.
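
Since the paper's pseudocode (Algorithm 1 and Algorithm 2) is not reproduced here and no official code is released, the following is a minimal, generic sketch of a conditional DDPM-style training step and ancestral-sampling loop for forecasting, conditioned on a past window. The `denoiser` module, the epsilon-prediction objective, and the placeholder linear beta schedule are illustrative assumptions; the paper's actual conditioning mechanisms and parameterization are not implemented here.

```python
import torch

K = 100                                     # diffusion steps, as quoted in Experiment Setup
betas = torch.linspace(1e-4, 1e-1, K)       # placeholder linear schedule for illustration
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

def train_step(denoiser, optimizer, x_past, x_future):
    """One DDPM training step: noise the future window, predict the noise given the past."""
    b = x_future.size(0)
    k = torch.randint(0, K, (b,))                            # random diffusion step per sample
    eps = torch.randn_like(x_future)
    ab = alpha_bars[k].view(b, *([1] * (x_future.dim() - 1)))
    x_k = ab.sqrt() * x_future + (1 - ab).sqrt() * eps       # forward diffusion q(x_k | x_0)
    loss = ((denoiser(x_k, k, x_past) - eps) ** 2).mean()    # epsilon-prediction loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def sample(denoiser, x_past, future_shape):
    """Ancestral sampling: start from Gaussian noise and denoise, conditioned on the past."""
    x = torch.randn(future_shape)
    for k in reversed(range(K)):
        kk = torch.full((future_shape[0],), k, dtype=torch.long)
        eps_hat = denoiser(x, kk, x_past)
        x = (x - betas[k] / (1 - alpha_bars[k]).sqrt() * eps_hat) / alphas[k].sqrt()
        if k > 0:
            x = x + betas[k].sqrt() * torch.randn_like(x)    # sigma_k = sqrt(beta_k) choice
    return x
```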
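
The Dataset Splits row quotes only the split ratios, so a small helper like the one below (a sketch, not the authors' code) reproduces chronological 6:2:2 or 7:1:2 splits. The `chronological_split` name and the placeholder array shape are assumptions for illustration.

```python
import numpy as np

def chronological_split(series: np.ndarray, ratios=(0.7, 0.1, 0.2)):
    """Split a (time, features) array into train/val/test in temporal order, no shuffling."""
    assert abs(sum(ratios) - 1.0) < 1e-8
    n = len(series)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    return series[:n_train], series[n_train:n_train + n_val], series[n_train + n_val:]

# 7:1:2 for Weather, Wind, Traffic, Electricity, Exchange; 6:2:2 for ETTh1 and ETTm1.
data = np.random.randn(10_000, 321)          # placeholder array, not a real dataset
train, val, test = chronological_split(data, ratios=(0.7, 0.1, 0.2))
```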
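
Finally, the Experiment Setup row can be collected into a training configuration. The cosine interpolation between β1 and βK below is only one possible reading of "cosine variance schedule (Rasul et al., 2021)", and the placeholder model and optimizer call are assumptions, since the actual TimeDiff denoiser is not released.

```python
import math
import torch

config = {
    "lr": 1e-3,          # Adam learning rate
    "batch_size": 64,
    "max_epochs": 100,   # trained with early stopping
    "K": 100,            # diffusion steps
    "beta_1": 1e-4,
    "beta_K": 1e-1,
}

def cosine_ramp_betas(beta_1, beta_K, K):
    """Interpolate beta_1 -> beta_K along a half-cosine ramp (an assumed reading of the schedule)."""
    steps = torch.arange(K, dtype=torch.float32)
    ramp = 0.5 * (1.0 - torch.cos(math.pi * steps / (K - 1)))   # goes from 0 to 1
    return beta_1 + ramp * (beta_K - beta_1)

betas = cosine_ramp_betas(config["beta_1"], config["beta_K"], config["K"])

model = torch.nn.Linear(168, 168)   # stand-in for the unreleased TimeDiff denoiser
optimizer = torch.optim.Adam(model.parameters(), lr=config["lr"])
```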