Scaling Law for Time Series Forecasting

Authors: Jingzhe Shi, Qinwei Ma, Huan Ma, Lei Li

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental We propose a theory for scaling law for time series forecasting that can explain these seemingly abnormal behaviors. Furthermore, we empirically evaluate various models using a diverse set of time series forecasting datasets, which (1) verifies the validity of scaling law on dataset size and model complexity within the realm of time series forecasting, and (2) validates our theoretical framework, particularly regarding the influence of look back horizon.
Researcher Affiliation Academia Jingzhe Shi (1), Qinwei Ma (1), Huan Ma (2), Lei Li (3,4). 1 Institute for Interdisciplinary Information Sciences, Tsinghua University; 2 Zhili College, Tsinghua University; 3 University of Copenhagen; 4 University of Washington
Pseudocode No No pseudocode or algorithm blocks are present in the paper.
Open Source Code Yes Code for our experiments has been made public at: https://github.com/JingzheShi/ScalingLawForTimeSeriesForecasting
Open Datasets Yes We conduct experiments on 8 datasets listed here. ETTh1, ETTh2, ETTm1, ETTm2 [4] contain 7 factors of electricity transformers from July 2016 to July 2018 at a certain frequency. Exchange [5] includes panel data for daily exchange rates from 8 countries across 27 years. Weather [5] includes 21 meteorological factors from a weather station. ECL [5] records electricity consumption of 321 clients. Traffic [5] records hourly road occupancy rates for 2 years on San Francisco Bay Area freeways.
Dataset Splits Yes Dataset: ETTh1 | Dim: 7 | Pred Len: 192 | Dataset Size (train, val, test): (8545, 2881, 2881) | Frequency: Hourly | Information: Electricity ... For all experiments, the pred-len is set to 192.
Hardware Specification Yes All the deep learning networks are implemented in PyTorch [40] and conducted on a cluster with NVIDIA RTX 3080, RTX 3090, RTX 4090D and A100 40GB GPUs.
Software Dependencies No The paper states 'All the deep learning networks are implemented in PyTorch [40]' but does not specify the version number of PyTorch or other software dependencies.
Experiment Setup Yes Batch sizes are chosen from {1024, 2048, 4096, 8192, 16384}, the learning rate is chosen from {0.003, 0.001}, and weight decay is chosen from {0.0005, 0.005, 0.001, 0.0001}; the learning rate is decayed with a factor from {0.96, 0.97, 0.98} for at most 100 epochs with patience at most 30.
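The hyperparameter choices quoted above form a small search grid. A minimal sketch of enumerating that grid is shown below; the key names (batch_size, lr, weight_decay, lr_decay) are our own illustrative labels, not identifiers from the authors' code.

```python
from itertools import product

# Hypothetical reconstruction of the search grid described in the quoted
# experiment setup; values are taken verbatim from the paper's text.
grid = {
    "batch_size": [1024, 2048, 4096, 8192, 16384],
    "lr": [0.003, 0.001],
    "weight_decay": [0.0005, 0.005, 0.001, 0.0001],
    "lr_decay": [0.96, 0.97, 0.98],
}

# Enumerate every candidate configuration (Cartesian product of the grid).
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 5 * 2 * 4 * 3 = 120 candidate configurations
```

In practice only a subset of these 120 combinations would typically be tried per dataset; the quote does not say whether the search was exhaustive.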