TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting

Authors: Shiyu Wang, Haixu Wu, Xiaoming Shi, Tengge Hu, Huakun Luo, Lintao Ma, James Y. Zhang, Jun Zhou

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments to evaluate the performance and efficiency of TimeMixer, covering long-term and short-term forecasting, including 18 real-world benchmarks and 15 baselines.
Researcher Affiliation | Collaboration | 1) Ant Group, Hangzhou, China; 2) Tsinghua University, Beijing, China. {weiming.wsy,lintao.mlt,peter.sxm,james.z,jun.zhoujun}@antgroup.com, {wuhx23,htg21,luohk19}@mails.tsinghua.edu.cn
Pseudocode | No | The paper describes the architecture and operations of TimeMixer's components (PDM, FMM) using mathematical formulations (e.g., Equations 1-3) and descriptive text, but it does not include any explicitly labeled pseudocode blocks or algorithms. (A hedged PyTorch-style sketch of the multiscale mixing idea is provided after the table.)
Open Source Code | Yes | The source code is provided in the supplementary materials and is publicly available on GitHub (https://github.com/kwuking/TimeMixer) for reproducibility.
Open Datasets | Yes | For long-term forecasting, we experiment on 8 well-established benchmarks: the ETT datasets (including 4 subsets: ETTh1, ETTh2, ETTm1, ETTm2), Weather, Solar-Energy, Electricity, and Traffic, following (Zhou et al., 2021; Wu et al., 2021; Liu et al., 2022a). For short-term forecasting, we adopt PeMS (Chen et al., 2001), which contains four public traffic network datasets (PEMS03, PEMS04, PEMS07, PEMS08), and the M4 dataset, which involves 100,000 different time series collected at different frequencies.
Dataset Splits | Yes | Table 6: Dataset detailed descriptions. The dataset size is organized as (Train, Validation, Test). (See the chronological-split sketch after the table.)
Hardware Specification | Yes | All the experiments are implemented in PyTorch (Paszke et al., 2019) and conducted on a single NVIDIA A100 80GB GPU.
Software Dependencies | No | The paper states that experiments are "implemented in PyTorch (Paszke et al., 2019)" and that the "ADAM optimizer (Kingma & Ba, 2015)" was used. However, it does not provide specific version numbers for PyTorch or any other relevant libraries or packages.
Experiment Setup | Yes | We set the initial learning rate to 10⁻² or 10⁻³ and used the ADAM optimizer (Kingma & Ba, 2015) with L2 loss for model optimization. The batch size was set between 8 and 128. By default, TimeMixer contains 2 Past Decomposable Mixing (PDM) blocks. We choose the number of scales M according to the length of the time series to balance performance and efficiency: to handle longer series in long-term forecasting, we set M to 3; for short-term forecasting with limited series length, we set M to 1. Detailed model configuration information is presented in Table 7. (See the training-setup sketch after the table.)
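
Since the paper provides no pseudocode, the following is a minimal PyTorch-style sketch of the core idea behind TimeMixer's Past Decomposable Mixing: inputs are average-pooled into multiple scales, each scale is split into seasonal and trend parts, seasons are mixed bottom-up (fine to coarse) and trends top-down (coarse to fine). The module names, moving-average kernel size, and linear mixing layers are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class MovingAvgDecomp(nn.Module):
    """Season-trend decomposition via a moving average (kernel size is an assumption)."""
    def __init__(self, kernel_size: int = 25):
        super().__init__()
        self.kernel_size = kernel_size
        self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=1)

    def forward(self, x):
        # x: [batch, length, channels]; pad both ends so the trend keeps the input length
        front = x[:, :1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        back = x[:, -1:, :].repeat(1, self.kernel_size // 2, 1)
        trend = self.avg(torch.cat([front, x, back], dim=1).permute(0, 2, 1)).permute(0, 2, 1)
        return x - trend, trend  # (seasonal, trend)

def multiscale_series(x, num_scales=3, window=2):
    """Build the multiscale inputs {x_0, ..., x_M} by average-pooling along time."""
    pool = nn.AvgPool1d(kernel_size=window, stride=window)
    scales = [x]
    for _ in range(num_scales):
        scales.append(pool(scales[-1].permute(0, 2, 1)).permute(0, 2, 1))
    return scales

class PastDecomposableMixing(nn.Module):
    """Sketch of PDM: seasonal parts mix fine-to-coarse, trend parts mix coarse-to-fine."""
    def __init__(self, lengths):
        super().__init__()
        self.decomp = MovingAvgDecomp()
        # linear maps over the time dimension between adjacent scales (illustrative choice)
        self.season_down = nn.ModuleList(
            [nn.Linear(lengths[i], lengths[i + 1]) for i in range(len(lengths) - 1)])
        self.trend_up = nn.ModuleList(
            [nn.Linear(lengths[i + 1], lengths[i]) for i in range(len(lengths) - 1)])

    def forward(self, xs):
        seasons, trends = map(list, zip(*[self.decomp(x) for x in xs]))
        for i in range(len(xs) - 1):            # bottom-up seasonal mixing
            seasons[i + 1] = seasons[i + 1] + self.season_down[i](
                seasons[i].permute(0, 2, 1)).permute(0, 2, 1)
        for i in reversed(range(len(xs) - 1)):  # top-down trend mixing
            trends[i] = trends[i] + self.trend_up[i](
                trends[i + 1].permute(0, 2, 1)).permute(0, 2, 1)
        return [s + t for s, t in zip(seasons, trends)]

# Example: look-back length 96, M = 3 scales with downsampling window 2
xs = multiscale_series(torch.randn(8, 96, 7))
mixed = PastDecomposableMixing(lengths=[96, 48, 24, 12])(xs)
```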
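
For the dataset splits, the authoritative (Train, Validation, Test) sizes are those listed in Table 6 of the paper. The sketch below only illustrates the usual chronological (non-shuffled) splitting used in this benchmark line of work; the 7:1:2 ratio is an assumption for illustration.

```python
import numpy as np

def chronological_split(series: np.ndarray, train_ratio=0.7, val_ratio=0.1):
    """Split a multivariate series [length, channels] into contiguous
    train/validation/test segments in time order (no shuffling).
    The 7:1:2 ratio is illustrative; use the per-dataset sizes from Table 6."""
    n = len(series)
    n_train = int(n * train_ratio)
    n_val = int(n * val_ratio)
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test
```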
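
The experiment-setup row translates directly into a standard PyTorch training configuration. The sketch below mirrors only the stated hyperparameters (Adam optimizer, L2/MSE loss, initial learning rate of 10⁻² or 10⁻³, batch size between 8 and 128); `model`, `train_dataset`, and the loop structure are placeholders rather than the authors' training script.

```python
import torch
from torch.utils.data import DataLoader

def train(model, train_dataset, lr=1e-3, batch_size=32, epochs=10, device="cuda"):
    """Minimal training loop matching the reported setup: Adam + MSE (L2) loss.
    The paper uses lr of 1e-2 or 1e-3 and batch sizes from 8 to 128."""
    loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = torch.nn.MSELoss()  # "L2 loss" in the paper
    for _ in range(epochs):
        for batch_x, batch_y in loader:
            batch_x, batch_y = batch_x.to(device), batch_y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(batch_x), batch_y)
            loss.backward()
            optimizer.step()
    return model
```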