ContiFormer: Continuous-Time Transformer for Irregular Time Series Modeling
Authors: Yuqi Chen, Kan Ren, Yansen Wang, Yuchen Fang, Weiwei Sun, Dongsheng Li
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | A wide range of experiments on both synthetic and real-world datasets have illustrated the superior modeling capacities and prediction performance of ContiFormer on irregular time series data. In this section, we evaluate ContiFormer on several tasks on irregular time series data, i.e., interpolation and extrapolation, classification, event prediction, and forecasting. |
| Researcher Affiliation | Collaboration | Yuqi Chen¹·², Kan Ren², Yansen Wang², Yuchen Fang²·³, Weiwei Sun¹, Dongsheng Li² — ¹ School of Computer Science & Shanghai Key Laboratory of Data Science, Fudan University; ² Microsoft Research Asia; ³ Shanghai Jiao Tong University |
| Pseudocode | Yes | The generation process is shown in Algorithm 1. |
| Open Source Code | Yes | The project link is https://seqml.github.io/contiformer/. |
| Open Datasets | Yes | We select 20 datasets from UEA Time Series Classification Archive [3] with diverse characteristics... We use one synthetic dataset and five real-world datasets, namely Synthetic, Neonate [50], Traffic [32], MIMIC [16], Book Order [16] and Stack Overflow [33] to evaluate our model. |
| Dataset Splits | Yes | We generate 300 spirals and 200/100 spirals are used for training/testing respectively. We use the 4-fold cross-validation scheme for Synthetic, Neonate, and Traffic datasets following [17], and the 5-fold cross-validation scheme for the other three datasets following [39, 64]. |
| Hardware Specification | Yes | All the experiments were carried out on a single 16GB NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | No software dependencies with version numbers (e.g., Python, PyTorch) are specified; only the ODE solver is named ('Runge-Kutta-4 [44] (RK4) algorithm'). |
| Experiment Setup | Yes | By default, we use the natural cubic spline to construct the continuous-time query function. The vector field in the ODE is defined as f(t, x) = Actfn(LN(Linear_{d,d}(Linear_{d,d}(x) + Linear_{1,d}(t)))), where Actfn(·) is either the tanh or sigmoid activation function, Linear_{a,b}(·): R^a → R^b is a linear transformation from dimension a to dimension b, and LN denotes layer normalization. We adopt the Gauss–Legendre quadrature approximation to implement Eq. (9). In the experiments, we solve the ODE with the fourth-order Runge-Kutta (RK4) algorithm [44] using a fixed step size of 0.1. For both our model and the baseline models, we adopted a fixed learning rate of 10⁻² and a batch size of 64. The training process for all models lasted 1000 epochs. |
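The vector field and fixed-step RK4 solve described in the experiment setup can be sketched as below. This is a minimal numpy illustration, not the authors' implementation: the weight matrices `W1`, `W2`, `w_t` and their random initialization are hypothetical stand-ins for the two `Linear_{d,d}` layers and the `Linear_{1,d}` time embedding, the layer norm omits learned affine parameters, and `tanh` is used as the activation (the paper allows tanh or sigmoid).

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # LN over the feature dimension; no learned scale/shift for simplicity.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def make_vector_field(d, rng):
    # Hypothetical weights; the paper does not specify initialization.
    W1 = rng.standard_normal((d, d)) / np.sqrt(d)  # Linear_{d,d}
    W2 = rng.standard_normal((d, d)) / np.sqrt(d)  # Linear_{d,d}
    w_t = rng.standard_normal(d)                   # Linear_{1,d}
    def f(t, x):
        # f(t, x) = Actfn(LN(Linear_{d,d}(Linear_{d,d}(x) + Linear_{1,d}(t))))
        return np.tanh(layer_norm(W2 @ (W1 @ x + w_t * t)))
    return f

def rk4_solve(f, x0, t0, t1, h=0.1):
    # Classic fourth-order Runge-Kutta with a fixed step size h (0.1 in the paper).
    x, t = x0, t0
    while t < t1 - 1e-9:
        hs = min(h, t1 - t)  # clip the last step to land exactly on t1
        k1 = f(t, x)
        k2 = f(t + hs / 2, x + hs / 2 * k1)
        k3 = f(t + hs / 2, x + hs / 2 * k2)
        k4 = f(t + hs, x + hs * k3)
        x = x + hs / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += hs
    return x

rng = np.random.default_rng(0)
d = 4
f = make_vector_field(d, rng)
x1 = rk4_solve(f, np.zeros(d), 0.0, 1.0)  # evolve the hidden state from t=0 to t=1
```

In ContiFormer this kind of ODE solve propagates the latent states between irregular observation timestamps, so the solver is invoked per attention layer rather than once as shown here.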