Deep Latent State Space Models for Time-Series Generation

Authors: Linqi Zhou, Michael Poli, Winnie Xu, Stefano Massaroli, Stefano Ermon

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We validate our method across a variety of time series datasets, benchmarking LS4 against an extensive set of baselines. We propose a set of 3 metrics to measure the quality of generated time series samples and show that LS4 performs significantly better than baselines on datasets with stiff transitions and obtains on average 30% lower MSE scores and ELBO. On sequences of length 20K, our model trains 100x faster than the baseline methods." From Section 5 (Experiments): "In this section, we verify the modeling capability of LS4 empirically."
Researcher Affiliation | Academia | "¹Stanford University, ²University of Toronto, ³MILA. Correspondence to: Linqi Zhou <linqizhou@stanford.edu>, Michael Poli <poli@stanford.edu>."
Pseudocode | Yes | "To demonstrate the computational efficiency, we additionally provide below pseudo-code for a single LS4 prior layer." (A hedged sketch of such a layer is given after this table.)
Open Source Code | No | The paper mentions reusing "the code from official repo" for the baselines, but gives no explicit statement or link for releasing the code of its own method.
Open Datasets | Yes | "We use Monash Time Series Repository (Godahewa et al., 2021), a comprehensive benchmark containing 30 time-series datasets collected in the real world, and we choose FRED-MD, NN5 Daily, Temperature Rain, and Solar Weekly as our target datasets." "We use USHCN and Physionet as our datasets of choice. The United States Historical Climatology Network (USHCN) (Menne et al., 2015) is a climate dataset... Physionet (Silva et al., 2012) is a dataset..."
Dataset Splits | No | "The datasets are split into 80% training data and 20% testing data." Only this train/test ratio is stated; no validation split, per-dataset counts, or split seed are given. (A minimal split sketch follows the table.)
Hardware Specification | Yes | "We test all models on a single RTX A5000 GPU."
Software Dependencies | No | The paper mentions using the "AdamW optimizer" but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "For all experiments we use the AdamW optimizer with learning rate 0.001. We use batch size 64 and train for 7000 epochs for FRED-MD, NN5 Daily, and Solar Weekly, 1000 epochs for Temperature Rain, and 500 epochs for Physionet and USHCN. For all MONASH experiments, we use the AdamW optimizer with learning rate 0.001 and no weight decay. For each of the prior/generative/inference models, we use 4 stacks for each for-loop in the pseudocode. For each LS4 block, we use 64 as the dimension of h_t and 64 SSM channels in parallel, the same as used in S4 and SaShiMi. Each residual block consists of 2 linear layers with a skip connection at the output level, where the first linear layer has an output size twice the input size and the second layer squeezes it back to the input size of the residual block. We generally find a 5-dimensional latent space gives better performance than a 1-dimensional one, and so use this setting throughout. We also employ EMA for the model weights with 0.999 as the decay (lambda) value, but we do not find this choice crucial. We also use 0.1 as the standard deviation for the observation, as this gives a better ELBO than other choices we experimented with, such as 1, 0.5, and 0.01." (A sketch of the residual block and training setup follows the table.)
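On the pseudocode row: the paper's own pseudo-code for the LS4 prior layer is not reproduced on this page, so the following is only a minimal sketch of the sequence-mixing core such a layer is built on: a diagonal, zero-order-hold-discretized state space model applied as a causal convolution. All names here (`DiagonalSSMLayer`, `d_model`, `d_state`) are our own; the real LS4 prior layer additionally handles latent variables and uses the structured S4 parameterizations, neither of which is shown.

```python
import torch
import torch.nn as nn

class DiagonalSSMLayer(nn.Module):
    """Sketch of an S4/LS4-style layer: NOT the authors' implementation.
    Uses a diagonal state matrix and materializes the convolution kernel
    naively, which is fine for illustration but not for 20K-length training."""

    def __init__(self, d_model: int = 64, d_state: int = 64):
        super().__init__()
        # One independent SSM per channel ("64 SSM channels in parallel" above).
        self.log_dt = nn.Parameter(torch.log(torch.rand(d_model) * 0.1 + 1e-3))
        self.A = nn.Parameter(-(torch.rand(d_model, d_state) + 0.5))  # Re(A) < 0: stable
        self.B = nn.Parameter(torch.randn(d_model, d_state) / d_state ** 0.5)
        self.C = nn.Parameter(torch.randn(d_model, d_state) / d_state ** 0.5)
        self.D = nn.Parameter(torch.zeros(d_model))  # direct feed-through term
        self.act = nn.GELU()

    def kernel(self, L: int) -> torch.Tensor:
        # Zero-order-hold discretization of x' = Ax + Bu, y = Cx + Du (diagonal A).
        dt = torch.exp(self.log_dt)[:, None]            # (d_model, 1)
        dA = torch.exp(self.A * dt)                     # (d_model, d_state), in (0, 1)
        dB = (dA - 1.0) / self.A * self.B               # (d_model, d_state)
        powers = dA.unsqueeze(0) ** torch.arange(L, device=dA.device)[:, None, None]
        # k[m, l] = sum_s C[m, s] * dA[m, s]^l * dB[m, s]
        return torch.einsum("lms,ms,ms->ml", powers, dB, self.C)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u: (batch, length, d_model) -> same shape; causal convolution via FFT.
        L = u.size(1)
        k = self.kernel(L)                              # (d_model, L)
        u_t = u.transpose(1, 2)                         # (batch, d_model, L)
        y = torch.fft.irfft(torch.fft.rfft(u_t, n=2 * L)
                            * torch.fft.rfft(k, n=2 * L), n=2 * L)[..., :L]
        y = y + self.D[:, None] * u_t                   # channel-wise skip
        return self.act(y.transpose(1, 2))

# Usage: shapes match the 64-channel setting quoted in the setup row.
out = DiagonalSSMLayer(64, 64)(torch.randn(8, 256, 64))  # (8, 256, 64)
```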
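On the dataset-splits row: the paper states only the 80/20 train/test ratio, so everything beyond that ratio below is an assumption; in particular the seed is our own arbitrary choice, since none is given.

```python
import torch
from torch.utils.data import random_split

def split_80_20(dataset, seed: int = 0):
    """80/20 train/test split per the quoted ratio; seed is ours, not the paper's."""
    n_train = int(0.8 * len(dataset))
    return random_split(dataset, [n_train, len(dataset) - n_train],
                        generator=torch.Generator().manual_seed(seed))
```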
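Finally, the experiment-setup row pins down several concrete choices: the residual block shape, AdamW at learning rate 0.001 with no weight decay, and weight EMA with decay 0.999. The sketch below assembles exactly those stated values; the GELU activation, the depth-4 stack, and the `ema_update` helper are our assumptions for illustration.

```python
import copy
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Two linear layers with an output-level skip: the first expands to twice
    the input width, the second squeezes back, as described in the setup row.
    The GELU in between is an assumption; the paper does not name one here."""

    def __init__(self, d: int = 64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(d, 2 * d), nn.GELU(), nn.Linear(2 * d, d))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.net(x)

# Stated training choices: AdamW, lr 0.001, no weight decay.
model = nn.Sequential(*[ResidualBlock(64) for _ in range(4)])  # depth 4 mirrors "4 stacks"
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.0)

# EMA copy of the weights with decay 0.999, as quoted above.
ema_model = copy.deepcopy(model).requires_grad_(False)

@torch.no_grad()
def ema_update(ema: nn.Module, online: nn.Module, decay: float = 0.999) -> None:
    for p_ema, p in zip(ema.parameters(), online.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)  # ema <- decay*ema + (1-decay)*p
```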