Domain Adaptation for Time Series Forecasting via Attention Sharing

Authors: Xiaoyong Jin, Youngsuk Park, Danielle Maddix, Hao Wang, Yuyang Wang

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on various domains demonstrate that our proposed method outperforms state-of-the-art baselines on synthetic and real-world datasets, and ablation studies verify the effectiveness of our design choices.
Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of California Santa Barbara, California, USA (work done during internship at Amazon AWS AI); (2) Amazon AWS AI; (3) Rutgers University.
Pseudocode | Yes | Algorithm 1: Adversarial Training of DAF (an illustrative training-loop sketch follows the table).
Open Source Code | No | The paper mentions using the 'publicly available version on Sagemaker' for DAR and implementing models in PyTorch, but it does not provide concrete access (a link or explicit statement) to the source code for the DAF methodology itself or the authors' specific implementations.
Open Datasets | Yes | We perform experiments on four real benchmark datasets that are widely used in the forecasting literature: elec and traf from the UCI data repository (Dua & Graff, 2017), and sales (Kar, 2019) and wiki (Lai, 2017) from Kaggle.
Dataset Splits | Yes | We partition the target datasets equally into training/validation/test splits, i.e., 10/10/10 days for hourly datasets and 20/20/20 days for daily datasets (see the splitting sketch below).
Hardware Specification | No | The paper states that models were trained 'on AWS Sagemaker' but does not provide specific hardware details such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper states, 'We implement the models using PyTorch (Paszke et al., 2019),' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | The following hyperparameters of DAF and the baseline models are selected by grid search over the validation set (see the grid-search sketch below): the hidden dimension h ∈ {32, 64, 128, 256} for all models; the number of MLP layers l_MLP ∈ {4} for N-BEATS and l_MLP ∈ {1, 2, 3} for AttF, DAF and its variants; the number of RNN layers l_RNN ∈ {1, 3} in DAR and RDA; the kernel sizes of convolutions s ∈ {3, 13, (3, 5), (3, 17)} in AttF, DAF and its variants; the learning rate γ ∈ {0.001, 0.01, 0.1} for all models; and the trade-off coefficient λ ∈ {0.1, 1, 10} in equation (2) for DAF and RDA-ADDA.