An Analysis of Linear Time Series Forecasting Models
Authors: William Toner, Luke Nicholas Darlow
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experimental evidence that the models under inspection learn nearly identical solutions, and finally demonstrate that the simpler closed form solutions are superior forecasters across 72% of test settings. |
| Researcher Affiliation | Collaboration | ANC, Department of Informatics, University of Edinburgh, Edinburgh; Systems Infrastructure Research, Huawei Research Centre, Edinburgh. Correspondence to: William Toner <w.j.toner@sms.ed.ac.uk>, Luke Darlow <luke.darlow1@huawei.com>. |
| Pseudocode | No | The paper provides mathematical proofs and descriptions of models but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | To ensure reproducibility, the code to fit and evaluate OLS solutions in this paper can be found here: github.com/sir-lab/linear-forecasting. (A hedged closed-form OLS sketch follows the table.) |
| Open Datasets | Yes | For our experiments in Section 5.2 we use 8 standard time series benchmarking datasets: ETTh1 and ETTh2: 7-channel hourly datasets... We refer the reader to (Wu et al., 2021) for further details. |
| Dataset Splits | Yes | ETTh1 and ETTh2: 7-channel hourly datasets (Train-Val-Test Splits [8545,2881,2881]); their per-minute equivalents ETTm1 and ETTm2 (also 7-channel, Train-Val-Test Splits [34465,11521,11521]); ECL: an hourly 321-channel electricity dataset (Train-Val-Test Splits [18317,2633,5261]); Weather: a per-10-minute-resolution 21-channel weather dataset (Train-Val-Test Splits [36792,5271,10540]); Traffic: an 862-channel traffic dataset (Train-Val-Test Splits [12185,1757,3509]); and Exchange: a small 8-channel finance dataset (Train-Val-Test Splits [5120,665,1422]). (A chronological-split sketch follows the table.) |
| Hardware Specification | Yes | on an NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'scikit-learn' (for Linear Regression model and Ridge Regression function), and references implementations from other authors, but it does not specify exact version numbers for any of these software components or libraries. |
| Experiment Setup | Yes | For each model, dataset, and horizon combination we train for 50 epochs using a learning rate of 0.0005 and the Adam optimizer with the default hyperparameter settings. We use a batch size of 128 in all experiments. We track the validation loss during training. At test time we load the model with minimal validation loss to evaluate on the test set, which is equivalent to early stopping. [...] In all cases we use a context length of 720. (A hedged training-loop sketch follows the table.) |
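
The Open Source Code row points to the authors' repository for fitting and evaluating OLS solutions. As a rough sketch of what a closed-form fit involves (this is not the repository's actual API; `make_windows`, `fit_ols_forecaster`, and the default `lam` are hypothetical choices of ours), a ridge-regularised least-squares forecaster mapping a 720-step context to the forecast horizon can be computed directly:

```python
import numpy as np

def make_windows(series, context=720, horizon=96):
    """Slice a 1-D series into (context, horizon) training pairs.

    Hypothetical helper; the paper's repository may window data differently.
    """
    X, Y = [], []
    for t in range(len(series) - context - horizon + 1):
        X.append(series[t : t + context])
        Y.append(series[t + context : t + context + horizon])
    return np.asarray(X), np.asarray(Y)

def fit_ols_forecaster(X, Y, lam=1e-3):
    """Closed-form ridge fit: W = (X^T X + lam * I)^{-1} X^T Y.

    A bias term is absorbed by appending a constant column to X.
    lam = 0 recovers plain OLS (assuming X^T X is invertible).
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])   # append bias column
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])       # regularised Gram matrix
    W = np.linalg.solve(A, Xb.T @ Y)                # (context + 1, horizon) weights
    return W

def predict(W, X):
    """Forecast: append the bias column and apply the learned weights."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ W
```

Setting `lam = 0` recovers plain OLS; it is closed-form solutions of this kind that the paper reports as superior forecasters across 72% of test settings.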
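
The splits quoted above are contiguous chronological segments of each series. A minimal sketch, assuming a simple three-way cut (the helper name is ours; benchmark code often also prepends the last context window of the preceding segment so the first validation/test forecast has a full 720-step history, a detail omitted here):

```python
import numpy as np

def chronological_split(series, sizes):
    """Cut a series into contiguous train/val/test segments.

    sizes -- e.g. [8545, 2881, 2881] for ETTh1, as quoted above.
    """
    bounds = np.cumsum(sizes)
    train = series[: bounds[0]]
    val = series[bounds[0] : bounds[1]]
    test = series[bounds[1] : bounds[2]]
    return train, val, test
```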
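
The Experiment Setup row describes a standard supervised loop. A hedged PyTorch sketch of that recipe (50 epochs, Adam at 0.0005 with defaults otherwise, batch size 128, best-validation checkpointing), assuming a plain `nn.Linear` forecaster and pre-windowed tensors; all names are ours, not the authors' code:

```python
import copy
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def train(X_train, Y_train, X_val, Y_val, context=720, horizon=96):
    """Train a linear forecaster with the hyperparameters quoted above."""
    model = nn.Linear(context, horizon)
    opt = torch.optim.Adam(model.parameters(), lr=5e-4)  # Adam defaults otherwise
    loss_fn = nn.MSELoss()
    loader = DataLoader(TensorDataset(X_train, Y_train),
                        batch_size=128, shuffle=True)

    best_val, best_state = float("inf"), None
    for epoch in range(50):
        model.train()
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
        # Track validation loss; keep the best weights.
        model.eval()
        with torch.no_grad():
            val = loss_fn(model(X_val), Y_val).item()
        if val < best_val:
            best_val, best_state = val, copy.deepcopy(model.state_dict())

    model.load_state_dict(best_state)  # evaluate this checkpoint on the test set
    return model
```

Loading the best-validation checkpoint before test evaluation is the early-stopping equivalence the quoted setup describes.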