SCAT: A Time Series Forecasting with Spectral Central Alternating Transformers
Authors: Chengjie Zhou, Chao Che, Pengfei Wang, Qiang Zhang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on ten real-world datasets, encompassing Wind, Electricity, Weather, and others, demonstrate that our Spectral Central Alternating Transformer (SCAT) outperforms state-of-the-art (SOTA) methods by an average of 17.5% in power time series forecasting. |
| Researcher Affiliation | Academia | ¹School of Computer Science and Technology, Dalian University of Technology; ²Key Laboratory of Advanced Design and Intelligent Computing (Dalian University), Ministry of Education, Dalian University |
| Pseudocode | No | The paper includes architectural diagrams and mathematical formulas but does not provide a clearly labeled pseudocode block or algorithm. |
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing the code for the methodology or a link to a code repository. |
| Open Datasets | No | The paper names several datasets (Windm1-5, ETTm1-2, Weather, Traffic, Electricity) and describes their characteristics in Table 2 and Section 4.1. However, it does not provide specific links, DOIs, repository names, or formal citations (author names and year) through which these datasets can be accessed. |
| Dataset Splits | Yes | The details of the datasets are presented in Table 2. Dataset Size denotes the total number of time points in the (Train, Validation, Test) split, e.g., Windm1: (23740, 3391, 6783); a chronological-split sketch follows the table. |
| Hardware Specification | Yes | The experimental framework used Pytorch, and the algorithm ran on an RTX 3080 GPU. |
| Software Dependencies | No | The paper mentions using 'Pytorch' as the experimental framework but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | The SCAT architecture includes two encoder layers and one decoder layer. We set the cluster center dimension to k, fixed the model dimension at 512, and used eight attention components. Additionally, we applied a dropout rate of 0.1. The training batch size was 32. We utilized Adam with an initial learning rate in {1e-3, 5e-4, 1e-4, 5e-5}. A configuration sketch follows the table. |
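The Dataset Splits row above reports only point counts. As a reading aid, here is a minimal Python sketch of a chronological (Train, Validation, Test) split using the Windm1 sizes quoted there (23740, 3391, 6783); the `chronological_split` helper, the synthetic data, and the feature count are assumptions for illustration, since the authors' actual preprocessing is not published.

```python
import numpy as np

def chronological_split(series: np.ndarray, n_train: int, n_val: int, n_test: int):
    """Split a [T, features] series into consecutive, non-overlapping chunks in time order."""
    assert len(series) >= n_train + n_val + n_test
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:n_train + n_val + n_test]
    return train, val, test

# Synthetic stand-in for Windm1: 23740 + 3391 + 6783 = 33914 time points,
# with 7 feature columns (the column count is an assumption).
data = np.random.randn(33914, 7)
train, val, test = chronological_split(data, 23740, 3391, 6783)
print(train.shape, val.shape, test.shape)  # (23740, 7) (3391, 7) (6783, 7)
```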
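The Experiment Setup row lists hyperparameters, but the SCAT code itself is not released, so the sketch below only wires the reported values (two encoder layers, one decoder layer, model dimension 512, eight attention heads, dropout 0.1, batch size 32, Adam with the learning rate searched over {1e-3, 5e-4, 1e-4, 5e-5}) into a plain PyTorch `nn.Transformer` as a stand-in; the spectral central alternating blocks, the input/prediction lengths, and any feature projection layers are not reproduced here and are assumptions.

```python
import torch
import torch.nn as nn

d_model, n_heads, dropout = 512, 8, 0.1  # reported model dimension, attention heads, dropout

# Vanilla Transformer stand-in with the reported depth: 2 encoder layers, 1 decoder layer.
model = nn.Transformer(
    d_model=d_model,
    nhead=n_heads,
    num_encoder_layers=2,
    num_decoder_layers=1,
    dropout=dropout,
    batch_first=True,
)

batch_size = 32                       # reported training batch size
lr_grid = [1e-3, 5e-4, 1e-4, 5e-5]    # reported initial learning rate search space
optimizer = torch.optim.Adam(model.parameters(), lr=lr_grid[0])

# Dummy forward pass; the input length 96 and prediction length 24 are assumptions.
src = torch.randn(batch_size, 96, d_model)
tgt = torch.randn(batch_size, 24, d_model)
out = model(src, tgt)                 # -> shape (32, 24, 512)
```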