SparseTSF: Modeling Long-term Time Series Forecasting with *1k* Parameters

Authors: Shengsheng Lin, Weiwei Lin, Wentai Wu, Haojun Chen, Junjie Yang

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In this section, we present the experimental results of SparseTSF on mainstream LTSF benchmarks. Additionally, we discuss the efficiency advantages brought by the lightweight architecture of SparseTSF. Furthermore, we conduct ablation studies and analysis to further reveal the effectiveness of the Sparse technique. |
| Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China; (2) Peng Cheng Laboratory, Shenzhen 518066, China; (3) College of Information Science and Technology, Jinan University, Guangzhou 510632, China. Correspondence to: Weiwei Lin <linww@scut.edu.cn>. |
| Pseudocode | Yes | Algorithm 1: The Overall Pseudocode of SparseTSF (sketched after this table). |
| Open Source Code | Yes | The code is publicly available at this repository: https://github.com/lss-1138/SparseTSF |
| Open Datasets | Yes | We conducted experiments on four mainstream LTSF datasets that exhibit daily periodicity. These datasets include ETTh1 & ETTh2 (https://github.com/zhouhaoyi/ETDataset), Electricity (https://archive.ics.uci.edu/ml/datasets), and Traffic (https://pems.dot.ca.gov/). The details of these datasets are presented in Table 1. |
| Dataset Splits | Yes | The dataset splitting follows the procedures of FITS and Autoformer, where the ETT datasets are divided into proportions of 6:2:2, while the other datasets are split into proportions of 7:1:2. (A split sketch follows this table.) |
| Hardware Specification | Yes | All experiments in this study were implemented using PyTorch (Paszke et al., 2019) and conducted on a single NVIDIA RTX 4090 GPU with 24GB of memory. |
| Software Dependencies | No | We implemented SparseTSF in PyTorch (Paszke et al., 2019) and trained it using the Adam optimizer (Kingma & Ba, 2014). While PyTorch is mentioned, a specific version number for the software dependency is not provided. |
| Experiment Setup | Yes | We implemented SparseTSF in PyTorch (Paszke et al., 2019) and trained it using the Adam optimizer (Kingma & Ba, 2014) for 30 epochs, with a learning rate decay of 0.8 after the initial 3 epochs, and early stopping with a patience of 5. The period w is set to the inherent cycle of the data (e.g., w = 24 for ETTh1)... For datasets with fewer than 100 channels (such as ETTh1), the batch size is set to 256, while for datasets with fewer than 300 channels (such as Electricity), the batch size is set to 128. This setting maximizes the utilization of GPU parallel computing capabilities while avoiding GPU out-of-memory issues (i.e., with NVIDIA RTX 4090, 24GB). Additionally, the learning rate needs to be set relatively large (i.e., 0.02). (A training-schedule sketch follows this table.) |
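The Pseudocode entry refers to Algorithm 1 of the paper, the cross-period sparse forecasting procedure: downsample the length-L lookback by the period w, apply one shared linear layer across the downsampled points, and upsample the result back to the length-H horizon. The sketch below is a minimal PyTorch illustration of that structure under stated assumptions, not the authors' code: the class name `SparseTSFSketch` is ours, the paper's sliding-window convolutional aggregation is omitted, and L and H are assumed to be exact multiples of w.

```python
import torch
import torch.nn as nn


class SparseTSFSketch(nn.Module):
    """Minimal sketch of cross-period sparse forecasting (illustrative, not the repo's code)."""

    def __init__(self, seq_len: int, pred_len: int, period: int):
        super().__init__()
        assert seq_len % period == 0 and pred_len % period == 0
        self.seq_len, self.pred_len, self.period = seq_len, pred_len, period
        # A single shared linear map from the L/w downsampled points to H/w points;
        # this layer holds essentially all of the model's parameters.
        self.linear = nn.Linear(seq_len // period, pred_len // period, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, channels) -- channel-independent forecasting
        b, l, c = x.shape
        mean = x.mean(dim=1, keepdim=True)                 # simple per-instance normalization
        x = (x - mean).permute(0, 2, 1)                    # (b, c, L)
        # Downsample: split L into L/w segments of length w, then transpose so the
        # linear layer mixes the points that share the same phase within the period.
        x = x.reshape(b, c, l // self.period, self.period).permute(0, 1, 3, 2)  # (b, c, w, L/w)
        y = self.linear(x)                                 # (b, c, w, H/w)
        # Upsample: undo the transpose/reshape to recover a length-H forecast.
        y = y.permute(0, 1, 3, 2).reshape(b, c, self.pred_len)
        return y.permute(0, 2, 1) + mean                   # (b, pred_len, channels)


# Example: L = 720, H = 96, w = 24 (daily period on hourly data)
model = SparseTSFSketch(seq_len=720, pred_len=96, period=24)
out = model(torch.randn(8, 720, 7))                        # -> torch.Size([8, 96, 7])
```

With L = 720, H = 720, and w = 24, the single linear layer has 30 × 30 = 900 weights, consistent with the *1k*-parameter scale in the title.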
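The 6:2:2 and 7:1:2 proportions above are chronological ratio splits. The helper below is a rough sketch of such a split; the exact boundary handling (and any lookback overlap between splits) is our assumption and may differ from the FITS/Autoformer data loaders.

```python
def chronological_split(n_rows: int, ratios=(0.6, 0.2, 0.2)):
    """Return train/val/test slices for a chronological ratio split (illustrative only)."""
    n_train = int(n_rows * ratios[0])
    n_val = int(n_rows * ratios[1])
    return (
        slice(0, n_train),                  # train
        slice(n_train, n_train + n_val),    # validation
        slice(n_train + n_val, n_rows),     # test
    )


# ETT datasets: 6:2:2; Electricity and Traffic: 7:1:2
# train_idx, val_idx, test_idx = chronological_split(n_rows, (0.7, 0.1, 0.2))
```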
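The quoted training recipe (Adam, learning rate 0.02, 30 epochs, a ×0.8 learning-rate decay after the first 3 epochs, early stopping with patience 5) maps onto a fairly standard PyTorch loop. The sketch below shows only the schedule and stopping rule; the model, data, and "validation" loss are dummies, and none of this is the training script from the SparseTSF repository.

```python
import torch
import torch.nn as nn

model = nn.Linear(30, 4, bias=False)          # dummy stand-in (e.g., L/w = 30 -> H/w = 4)
optimizer = torch.optim.Adam(model.parameters(), lr=0.02)
# Keep the full learning rate for the first 3 epochs, then multiply by 0.8 per epoch.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda e: 1.0 if e < 3 else 0.8 ** (e - 2)
)
criterion = nn.MSELoss()
x, y = torch.randn(256, 30), torch.randn(256, 4)            # dummy batch (batch size 256)

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(30):
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()
    val_loss = loss.item()                                   # dummy "validation" loss
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break                                            # early stopping
```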