ForecastPFN: Synthetically-Trained Zero-Shot Forecasting
Authors: Samuel Dooley, Gurnoor Singh Khurana, Chirag Mohapatra, Siddartha V Naidu, Colin White
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we show that zero-shot predictions made by ForecastPFN are more accurate and faster compared to state-of-the-art forecasting methods, even when the other methods are allowed to train on hundreds of additional in-distribution data points. |
| Researcher Affiliation | Collaboration | Abacus.AI; Caltech |
| Pseudocode | No | The paper describes the model architecture and training procedure in detail, but it does not include any explicit pseudocode blocks or algorithms labeled as such. |
| Open Source Code | Yes | Our codebase and our model are available at https://github.com/abacusai/forecastpfn. |
| Open Datasets | Yes | To ensure a fair comparison, we evaluate on seven popular, real-world datasets across energy systems, economics, traffic, and weather: ECL (Electricity Consuming Load) [46], ETT 1 and 2 (Electricity Transformer Temperature) [52], Exchange [28], Illness [17], Traffic [39], and Weather [1]. |
| Dataset Splits | Yes | All non-zero-shot methods are allowed to train on $\{(t, y_t)\}_{t=500-x}^{500}$. Then, at test time, all algorithms see the 36 input data points and make a prediction of length ℓ, e.g., input of $\{(t, y_t)\}_{t=501}^{536}$ and make predictions for timesteps t = 537 to 537 + ℓ. We allow algorithms to use 10% of their data budget on validation. (This split protocol is sketched in the first code example below the table.) |
| Hardware Specification | Yes | Each epoch consists of 1,024,000 tasks, and we trained the transformer for 600 epochs with the Adam optimizer [25] on a single Tesla V100 16GB GPU, which took 30 hours. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'pmdarima [43]' for ARIMA, and 'official codebase' for other methods, but it does not specify version numbers for Python, PyTorch, or any other libraries/frameworks. |
| Experiment Setup | Yes | We set the input length ℓ = 100 and the maximum prediction of 10 steps into the future. Each epoch consists of 1,024,000 tasks, and we trained the transformer for 600 epochs with the Adam optimizer [25] on a single Tesla V100 16GB GPU, which took 30 hours. We use the Adam optimizer with a learning rate of 0.0001, and MSE loss. (A minimal training-step sketch follows the table.) |
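
As a reading aid for the Dataset Splits row above, the following is a minimal sketch of the described evaluation slicing, assuming a 0-indexed NumPy array where position t holds y_t. The helper name `make_zero_shot_eval_split` and the exact boundary conventions are illustrative assumptions, not code from the ForecastPFN repository.

```python
import numpy as np

def make_zero_shot_eval_split(series, data_budget_x, input_len=36, pred_len=10):
    """Illustrative slicing of one series into the splits described above.

    Hypothetical helper (not from the ForecastPFN codebase): non-zero-shot
    baselines may train on the x points ending at timestep 500, while every
    method receives the same 36-point input window and forecasts pred_len steps.
    """
    # Training budget for non-zero-shot baselines: the x points up to t = 500.
    train = series[500 - data_budget_x:500]
    # Shared test-time input: the 36 points at timesteps 501..536.
    test_input = series[500:500 + input_len]
    # Ground-truth forecast horizon immediately after the input window.
    test_target = series[500 + input_len:500 + input_len + pred_len]
    return train, test_input, test_target

# Example on a synthetic 600-point series.
y = np.sin(np.arange(600) / 10.0)
train, x_in, y_out = make_zero_shot_eval_split(y, data_budget_x=100)
print(train.shape, x_in.shape, y_out.shape)  # (100,) (36,) (10,)
```

Per the quoted protocol, a baseline method would additionally hold out 10% of `train` as its validation set.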
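
For the Experiment Setup row, the sketch below wires the quoted settings (Adam, learning rate 0.0001, MSE loss, input length 100, prediction length up to 10) into a placeholder PyTorch model. The actual ForecastPFN architecture is defined in the authors' repository and is not reproduced here; `PlaceholderPFN`, its layer sizes, and the choice of PyTorch are assumptions for illustration only.

```python
import torch
from torch import nn, optim

class PlaceholderPFN(nn.Module):
    """Stand-in encoder-only transformer; NOT the authors' architecture."""

    def __init__(self, d_model=64, input_len=100, pred_len=10):
        super().__init__()
        self.proj = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model * input_len, pred_len)

    def forward(self, x):               # x: (batch, input_len, 1)
        # Positional information is omitted here for brevity.
        h = self.encoder(self.proj(x))  # (batch, input_len, d_model)
        return self.head(h.flatten(1))  # (batch, pred_len)

model = PlaceholderPFN()
opt = optim.Adam(model.parameters(), lr=1e-4)  # learning rate from the paper
loss_fn = nn.MSELoss()                         # MSE loss, as reported

def train_step(batch_x, batch_y):
    """One optimization step on a batch of synthetic forecasting tasks."""
    opt.zero_grad()
    loss = loss_fn(model(batch_x), batch_y)
    loss.backward()
    opt.step()
    return loss.item()

# One synthetic batch: input length 100, prediction length 10.
x = torch.randn(32, 100, 1)
y = torch.randn(32, 10)
print(train_step(x, y))
```

In the paper's setup, such steps would be repeated over 1,024,000 synthetic tasks per epoch for 600 epochs.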