FIDE: Frequency-Inflated Conditional Diffusion Model for Extreme-Aware Time Series Generation

Authors: Asadullah Hill Galib, Pang-Ning Tan, Lifeng Luo

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on real-world and synthetic data showcase the efficacy of FIDE over baseline methods, highlighting its potential in advancing Generative AI for time series analysis, specifically in accurately modeling extreme events.
Researcher Affiliation | Academia | Asadullah Hill Galib, Pang-Ning Tan, and Lifeng Luo, Michigan State University. Emails: {galibasa, ptan, lluo}@msu.edu
Pseudocode | Yes | Algorithm 1 (Training) ... Algorithm 2 (Sampling)
Open Source Code | Yes | All the code and datasets used in this paper are available at https://github.com/galib19/FIDE.
Open Datasets | Yes | We partitioned each dataset into training, validation, and testing according to an 8:1:1 ratio. ... The datasets used are described in Appendix D:
(1) Synthetic Data (AR2): the AR(2) dataset comprises synthetic time series generated using an autoregressive model of order 2.
(2) Financial Data (Stocks): continuous-valued, aperiodic sequences of daily historical Google stock data spanning 2004 to 2019; the adjusted closing price is used in this work.
(3) Energy Data (Appliance Energy): the UCI Appliances energy prediction dataset [3] encompasses multivariate, continuous-valued measurements; the appliance energy series is used for analysis.
(4) Weather/Climate Data (Daily Minimum Temperature): this dataset [17] comprises daily minimum temperatures in Melbourne, Australia, from 1981 to 1990.
(5) Medical Data (ECG5000: Congestive Heart Failure): the original "ECG5000" dataset [9] originates from a 20-hour electrocardiogram (ECG) obtained from the Physionet database, specifically the BIDMC Congestive Heart Failure Database (chfdb), record "chf07". The processed data comprises 5,000 heartbeats randomly selected from the original recording.
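The AR(2) synthetic data mentioned above can be sketched as follows. This is a minimal illustration of an order-2 autoregressive generator; the coefficients, noise scale, and seed are illustrative assumptions, not the values used in the paper.

```python
import numpy as np

def generate_ar2(n_steps, phi1=0.5, phi2=-0.3, sigma=1.0, seed=0):
    """Generate a univariate AR(2) series: x_t = phi1*x_{t-1} + phi2*x_{t-2} + eps_t.

    phi1/phi2 are illustrative coefficients chosen to keep the process
    stationary; eps_t is i.i.d. Gaussian noise with std sigma.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps)
    for t in range(2, n_steps):
        x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + rng.normal(0.0, sigma)
    return x

series = generate_ar2(200)
```

With these coefficients the characteristic roots lie inside the unit circle, so the simulated series remains stationary and bounded in expectation.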
Dataset Splits | Yes | We partitioned each dataset into training, validation, and testing according to an 8:1:1 ratio.
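The 8:1:1 partitioning can be sketched as a simple order-preserving split; whether the paper shuffles before splitting is not stated here, so the chronological split below is an assumption.

```python
def split_8_1_1(data):
    """Split a sequence of samples into train/val/test with an 8:1:1 ratio.

    Assumes an order-preserving (chronological) split, which is common
    for time series but not confirmed by the report.
    """
    n = len(data)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    train = data[:n_train]
    val = data[n_train:n_train + n_val]
    test = data[n_train + n_val:]
    return train, val, test

train, val, test = split_8_1_1(list(range(100)))
# 100 samples -> 80 train, 10 validation, 10 test
```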
Hardware Specification | Yes | All experiments were conducted on an NVIDIA T4 GPU.
Software Dependencies | No | The paper mentions using the Adam optimizer and the Ray Tune framework with an ASHA scheduler, but it does not provide version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow).
Experiment Setup | Yes | The encoder component of our framework employs a 3-layer transformer architecture, accompanied by fully connected layers. Training used the Adam optimizer. For all methods, we perform extensive hyperparameter tuning on the length of the embedding vector, the number of hidden layers, the number of nodes, the learning rate, and the batch size. The optimal hyperparameters were determined using the Ray Tune framework with an Asynchronous Successive Halving Algorithm (ASHA) scheduler to enable early stopping.
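The Ray Tune + ASHA setup described above can be sketched as a configuration fragment like the one below. The search space, metric name, training function, and budget values are illustrative assumptions (the report names the tuned hyperparameters but not their ranges), and the exact reporting API varies across Ray versions.

```python
from ray import tune
from ray.tune.schedulers import ASHAScheduler

def train_fn(config):
    # Hypothetical training loop: build and train the model with the
    # sampled hyperparameters, reporting validation loss each epoch so
    # ASHA can stop unpromising trials early.
    for epoch in range(100):
        val_loss = ...  # placeholder; compute on the validation split
        tune.report(val_loss=val_loss)  # API name varies by Ray version

# Illustrative search space over the hyperparameters the paper tunes.
search_space = {
    "embedding_dim": tune.choice([32, 64, 128]),
    "num_hidden_layers": tune.choice([1, 2, 3]),
    "hidden_size": tune.choice([64, 128, 256]),
    "lr": tune.loguniform(1e-4, 1e-2),
    "batch_size": tune.choice([32, 64, 128]),
}

scheduler = ASHAScheduler(max_t=100, grace_period=5, reduction_factor=2)

analysis = tune.run(
    train_fn,
    config=search_space,
    metric="val_loss",
    mode="min",
    scheduler=scheduler,
    num_samples=20,
)
```

ASHA promotes only the best-performing fraction of trials to larger epoch budgets, which is what the report means by "early stopping" during hyperparameter search.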