One Fits All: Power General Time Series Analysis by Pretrained LM
Authors: Tian Zhou, Peisong Niu, Xue Wang, Liang Sun, Rong Jin
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This model, known as the Frozen Pretrained Transformer (FPT), is evaluated through fine-tuning on all major types of tasks involving time series. Our results demonstrate that pre-trained models on natural language or images can lead to comparable or state-of-the-art performance in all main time series analysis tasks, as illustrated in Figure 1. Besides extensive empirical studies, we also investigate why a transformer model pre-trained on the language domain can be adapted to time series analysis with almost no change. A frozen-backbone sketch of this setup appears after the table. |
| Researcher Affiliation | Industry | Tian Zhou, Peisong Niu, Xue Wang, Liang Sun, Rong Jin — {tian.zt,niupeisong.nps,xue.w,liang.sun,jinrong.jr}@alibaba-inc.com |
| Pseudocode | No | The paper describes the model architecture in text and diagrams (Figure 2) but does not include explicit pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code is publicly available at https://github.com/DAMO-DI-ML/One_Fits_All. |
| Open Datasets | Yes | We conduct experiments on six popular real-world datasets, including 4 ETT datasets Zhou et al. (2021) (ETTh1, ETTh2, ETTm1, ETTm2), Electricity, and Weather, where data missing is common. For classification, 10 multivariate UEA classification datasets Bagnall et al. (2018) are selected for evaluation. For anomaly detection, we compare models on five commonly used datasets: SMD Su et al. (2019), MSL Hundman et al. (2018), SMAP Hundman et al. (2018), SWaT Mathur & Tippenhauer (2016), and PSM Abdulaal et al. (2021). |
| Dataset Splits | Yes | Similar to traditional experimental settings, each time series is split into three parts: training data, validation data, and test data. For few-shot learning, only a certain percentage (10% or 5%) of the training timesteps is used. A minimal split sketch appears after the table. |
| Hardware Specification | Yes | We assessed the computational cost using a batch from ETTh2 (with a batch size of 128) on a 32 GB V100 GPU. |
| Software Dependencies | No | All the deep learning networks are implemented in PyTorch and trained on NVIDIA V100 32GB GPUs. We use the pre-trained models from Wolf et al. (2020) for experiments. While PyTorch is mentioned, no specific version number is provided for PyTorch or other libraries to ensure reproducibility. |
| Experiment Setup | Yes | To ensure a fair comparison, we use the GPT2-backbone FPT and adhere to the experimental settings of TimesNet Wu et al. (2023). For few-shot learning, an early stopping counter is employed to stop the training process after three epochs if no loss degradation on the validation set is observed. To better balance performance and computational efficiency, we test various numbers of layers on ETTh2. An early-stopping sketch appears after the table. |
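
The Research Type row summarizes the paper's core technique: a GPT-2 backbone whose self-attention and feed-forward weights are frozen, with only the positional embeddings, layer norms, and newly added input/output layers fine-tuned on time series. The following is a minimal sketch of that idea, assuming the Hugging Face `GPT2Model` API (Wolf et al., 2020); the patch-embedding and forecasting-head modules (`input_proj`, `output_proj`) and the last-token readout are illustrative choices, not the authors' exact code.

```python
# Minimal sketch of the frozen-backbone (FPT) idea: load pre-trained GPT-2,
# freeze the self-attention and feed-forward weights, and keep only the
# positional embeddings and layer norms trainable, alongside new input/output
# projections for time series. Only GPT2Model comes from the paper's stated
# dependency (Wolf et al., 2020); the rest is illustrative.
import torch
import torch.nn as nn
from transformers import GPT2Model


class FrozenPretrainedTransformer(nn.Module):
    def __init__(self, patch_len: int, pred_len: int = 96, n_layers: int = 6, d_model: int = 768):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")
        self.backbone.h = self.backbone.h[:n_layers]  # keep only the first few transformer blocks

        # Freeze everything except layer norms ("ln_*") and positional embeddings ("wpe").
        for name, param in self.backbone.named_parameters():
            param.requires_grad = ("ln" in name) or ("wpe" in name)

        self.input_proj = nn.Linear(patch_len, d_model)   # illustrative patch embedding
        self.output_proj = nn.Linear(d_model, pred_len)   # illustrative forecasting head

    def forward(self, x_patches: torch.Tensor) -> torch.Tensor:
        # x_patches: (batch, num_patches, patch_len)
        h = self.input_proj(x_patches)
        h = self.backbone(inputs_embeds=h).last_hidden_state
        return self.output_proj(h[:, -1])  # predict from the last patch's representation
```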
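
The Dataset Splits row mentions chronological train/validation/test splits with only 10% or 5% of the training timesteps kept in the few-shot setting. A minimal sketch of such a split follows; the 7:1:2 ratio and the choice to keep the earliest fraction of the training segment are assumptions for illustration, not details quoted from the paper.

```python
# Minimal sketch of a chronological split with a few-shot training fraction.
# The split ratios and the "keep the first fraction" rule are assumptions.
import numpy as np


def few_shot_split(series: np.ndarray, train_ratio=0.7, val_ratio=0.1, few_shot_fraction=0.10):
    n = len(series)
    n_train, n_val = int(n * train_ratio), int(n * val_ratio)
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    # Few-shot setting: keep only a fraction (e.g. 10% or 5%) of the training timesteps.
    train = train[: int(len(train) * few_shot_fraction)]
    return train, val, test
```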
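
The Experiment Setup row describes early stopping after three epochs without improvement of the validation loss in few-shot learning. A generic counter implementing that rule might look like the following; `train_one_epoch` and `evaluate` are placeholder callables, not functions from the authors' code.

```python
# Generic early-stopping loop: stop after `patience` consecutive epochs
# without a lower validation loss. Training/evaluation callables are placeholders.
def train_with_early_stopping(model, train_one_epoch, evaluate, max_epochs: int = 100, patience: int = 3):
    best_val_loss = float("inf")
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch(model)
        val_loss = evaluate(model)
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return model
```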