ARM: Refining Multivariate Forecasting with Adaptive Temporal-Contextual Learning
Authors: Jiecheng Lu, Xu Han, Shihao Yang
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have conducted extensive experiments on 9 widely-used datasets, including the ETT (Zhou et al., 2021), Traffic, Electricity, Weather, ILI, and Exchange Rate (Lai et al., 2018) datasets (details in A.1). Our baseline comparison included Transformer-based models such as PatchTST (Nie et al., 2022), FEDformer (Zhou et al., 2022), Autoformer (Wu et al., 2021), and Informer (Zhou et al., 2021), and a linear model, DLinear (Zeng et al., 2023). The performance was evaluated using mean squared error (MSE) and mean absolute error (MAE). The results demonstrate that ARM consistently outperforms previous models across 10 benchmarks. *(A hedged sketch of the MSE/MAE metrics appears below the table.)* |
| Researcher Affiliation | Collaboration | Jiecheng Lu (Georgia Institute of Technology), Xu Han (Amazon Web Services), Shihao Yang (Georgia Institute of Technology); jlu414@gatech.edu, icyxu@amazon.com, shihao.yang@isye.gatech.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks with clear labels like 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. It mentions using existing models like DLinear and PatchTST, but does not state that the authors' own code (ARM) is available or link to a repository for it. |
| Open Datasets | Yes | The ETT dataset (Zhou et al., 2021) is a collection of load and oil temperature data from electricity transformers, captured at 15-minute intervals between July 2016 and July 2018 (https://github.com/zhouhaoyi/ETDataset). The Electricity dataset includes hourly electricity consumption data for 321 clients from 2012 to 2014 (https://archive.ics.uci.edu/ml/datasets/ElectricityLoadDiagrams20112014). The Exchange dataset (Lai et al., 2018) is a collection of daily foreign exchange rates for eight countries... (https://github.com/laiguokun/multivariate-time-series-data). The Traffic dataset is a collection of hourly road occupancy rates... (http://pems.dot.ca.gov/). The Weather dataset comprises 21 meteorological indicators... (https://www.bgc-jena.mpg.de/wetter/). The ILI dataset is a collection of weekly data... (https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html). |
| Dataset Splits | Yes | For model selection, we partition each dataset into training, validation, and test sets with proportions of 70%, 10%, and 20%, respectively. The models are trained on the training set, and the best model is selected based on its MSE on the validation set. The MSE and MAE of this model on the test set are reported. *(A hedged sketch of this chronological split appears below the table.)* |
| Hardware Specification | Yes | We run our model on a single Nvidia RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions 'Pytorch' but does not specify its version number. It also mentions 'ptflops package' but without a version. |
| Experiment Setup | Yes | The ARM (Vanilla) model is trained using the Adam optimizer and MSE loss in PyTorch, with a learning rate of 0.00005 over 100 epochs on each dataset and an early-stopping patience of 30 steps. The first 10% of epochs are for warm-up, followed by a linear decay of the learning rate. ... The multi-kernel size s of MKLS is set to [25, 145, 385]... We use a batch size of 32 for most datasets. ... We employ a Transformer model dimension d = 16 for most of the small datasets (C < 20)... For datasets like Weather, Electricity, and Traffic (C >= 20)... we raise the dimension d to 64. ... we set it to 8. For the number of layers in the Transformer encoder-decoder structure, we use two encoder layers and one decoder layer. We do not apply dropout in the Transformer encoder; in the MKLS, we set the dropout rate to 0.25; and in the MoE, we set the dropout rate to 0.75. *(A hedged sketch of this training recipe appears below the table.)* |
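
The "Research Type" row quotes MSE and MAE as the evaluation metrics. As a quick illustration only (not code from the paper), the two metrics can be computed as below; the array names and shapes are assumptions.

```python
import numpy as np

def mse(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean squared error averaged over all horizons and channels."""
    return float(np.mean((pred - target) ** 2))

def mae(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean absolute error averaged over all horizons and channels."""
    return float(np.mean(np.abs(pred - target)))

# Hypothetical forecasts/targets with shape (batch, horizon, channels)
pred = np.zeros((32, 96, 7))
target = np.ones((32, 96, 7))
print(mse(pred, target), mae(pred, target))  # 1.0 1.0
```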
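The "Dataset Splits" row describes a 70%/10%/20% train/validation/test partition. A minimal sketch of a chronological split of that kind follows; the array and its dimensions are hypothetical, and per-dataset details in the paper may differ.

```python
import numpy as np

def chronological_split(series: np.ndarray, train_frac: float = 0.7, val_frac: float = 0.1):
    """Split a (time, channels) array into contiguous train/val/test segments."""
    n = len(series)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return series[:n_train], series[n_train:n_train + n_val], series[n_train + n_val:]

# Hypothetical multivariate series: 10,000 time steps, 7 channels
data = np.random.randn(10_000, 7)
train_set, val_set, test_set = chronological_split(data)
print(train_set.shape, val_set.shape, test_set.shape)  # (7000, 7) (1000, 7) (2000, 7)
```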
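The "Experiment Setup" row quotes the optimization recipe: Adam with learning rate 5e-5, MSE loss, 100 epochs, a 10% linear warm-up followed by linear decay, and early stopping with patience 30, with validation MSE driving model selection (per the "Dataset Splits" row). The sketch below is an assumed PyTorch rendering of that recipe, not the authors' code; `model`, the data loaders, and `evaluate` are placeholders, and "patience being 30 steps" is interpreted here as 30 epochs without validation improvement.

```python
import torch
import torch.nn as nn

EPOCHS, WARMUP_FRAC, PATIENCE, LR = 100, 0.10, 30, 5e-5

def make_lr_lambda(total_epochs: int, warmup_frac: float):
    warmup = max(1, int(total_epochs * warmup_frac))
    def lr_lambda(epoch: int) -> float:
        if epoch < warmup:  # linear warm-up to the base learning rate
            return (epoch + 1) / warmup
        # linear decay from the base learning rate toward zero
        return max(0.0, (total_epochs - epoch) / (total_epochs - warmup))
    return lr_lambda

def train(model, train_loader, val_loader, evaluate):
    optimizer = torch.optim.Adam(model.parameters(), lr=LR)
    scheduler = torch.optim.lr_scheduler.LambdaLR(
        optimizer, lr_lambda=make_lr_lambda(EPOCHS, WARMUP_FRAC))
    criterion = nn.MSELoss()
    best_val, since_best = float("inf"), 0
    for epoch in range(EPOCHS):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()
            optimizer.step()
        scheduler.step()
        val_mse = evaluate(model, val_loader)  # validation MSE for model selection
        if val_mse < best_val:
            best_val, since_best = val_mse, 0
        else:
            since_best += 1
            if since_best >= PATIENCE:  # early stopping after 30 epochs without improvement
                break
    return model
```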