reproducibilityindex.ai

ForecastPFN: Synthetically-Trained Zero-Shot Forecasting

Authors: Samuel Dooley, Gurnoor Singh Khurana, Chirag Mohapatra, Siddartha V Naidu, Colin White

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through extensive experiments, we show that zero-shot predictions made by ForecastPFN are more accurate and faster compared to state-of-the-art forecasting methods, even when the other methods are allowed to train on hundreds of additional in-distribution data points.
Researcher Affiliation	Collaboration	1 Abacus.AI, 2 Caltech
Pseudocode	No	The paper describes the model architecture and training procedure in detail, but it does not include any explicit pseudocode blocks or algorithms labeled as such.
Open Source Code	Yes	Our codebase and our model are available at https://github.com/abacusai/forecastpfn.
Open Datasets	Yes	To ensure a fair comparison, we evaluate on seven popular, real-world datasets across energy systems, economics, traffic, and weather: ECL (Electricity Consuming Load) [46], ETT 1 and 2 (Electricity Transformer Temperature) [52], Exchange [28], Illness [17], Traffic [39], and Weather [1].
Dataset Splits	Yes	All non-zero-shot methods are allowed to train on $\{(t, y_t)\}_{t=500-x}^{500}$. Then, at test time, all algorithms see the 36 input data points and make a prediction of length ℓ, e.g., an input of $\{(t, y_t)\}_{t=501}^{536}$ and predictions for timesteps t = 537 to 537 + ℓ. We allow algorithms to use 10% of their data budget on validation.
Hardware Specification	Yes	Each epoch consists of 1,024,000 tasks, and we trained the transformer for 600 epochs with the Adam optimizer [25] on a single Tesla V100 16GB GPU, which took 30 hours.
Software Dependencies	No	The paper mentions using 'Adam optimizer' and 'pmdarima [43]' for ARIMA, and 'official codebase' for other methods, but it does not specify version numbers for Python, PyTorch, or any other libraries/frameworks.
Experiment Setup	Yes	We set the input length ℓ = 100 and the maximum prediction of 10 steps into the future. Each epoch consists of 1,024,000 tasks, and we trained the transformer for 600 epochs with the Adam optimizer [25] on a single Tesla V100 16GB GPU, which took 30 hours. We use the Adam optimizer with a learning rate of 0.0001, and MSE loss.
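
The index arithmetic quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration, not code from the ForecastPFN repository. It assumes that x denotes the training data budget, that the training range starts at t = 500 - x, and that the 10% validation share is rounded down. The names `Split` and `evaluation_split` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Split(NamedTuple):
    """Timestep ranges (start, end) as quoted in the Dataset Splits row."""
    train: Tuple[int, int]     # range available to non-zero-shot methods
    context: Tuple[int, int]   # the 36 input points every method sees at test time
    forecast: Tuple[int, int]  # timesteps to predict: 537 to 537 + pred_len
    n_validation: int          # share of the budget reserved for validation


TRAIN_END = 500    # last training timestep, from the quoted text
CONTEXT_LEN = 36   # number of input points at test time, from the quoted text


def evaluation_split(budget: int, pred_len: int) -> Split:
    """Compute the ranges for a data budget x (assumed meaning) and length l."""
    if not 0 <= budget <= TRAIN_END:
        raise ValueError("budget must lie between 0 and 500")
    if pred_len < 1:
        raise ValueError("pred_len must be at least 1")

    context_start = TRAIN_END + 1            # 501
    context_end = TRAIN_END + CONTEXT_LEN    # 536
    forecast_start = context_end + 1         # 537

    return Split(
        train=(TRAIN_END - budget, TRAIN_END),
        context=(context_start, context_end),
        forecast=(forecast_start, forecast_start + pred_len),
        # Assumption: 10% of the budget, rounded down.
        n_validation=budget // 10,
    )


if __name__ == "__main__":
    print(evaluation_split(budget=50, pred_len=6))
```

The context and forecast ranges do not depend on the budget, so zero-shot and trained methods are scored on identical inputs; only the training range changes.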