Learning to Embed Time Series Patches Independently

Authors: Seunghan Lee, Taeyoung Park, Kibok Lee

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on various tasks, demonstrating that our proposed method outperforms the state-of-the-art (SOTA) performance in both forecasting and classification tasks, under both standard and transfer learning settings.
Researcher Affiliation | Academia | Seunghan Lee, Taeyoung Park, Kibok Lee; Department of Statistics and Data Science, Yonsei University.
Pseudocode | No | The paper describes the proposed method using descriptive text, mathematical equations, and diagrams (e.g., Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at this repository: https://github.com/seunghan96/pits.
Open Datasets | Yes | For forecasting tasks, we experiment with seven datasets, including four ETT datasets (ETTh1, ETTh2, ETTm1, ETTm2), Weather, Traffic, and Electricity (Wu et al., 2021). [...] These datasets have been widely employed for benchmarking and are publicly accessible (Wu et al., 2021).
Dataset Splits | Yes | For all hyperparameter tuning, we utilize a separate validation dataset, following the standard protocol of splitting all datasets into training, validation, and test sets in chronological order with a ratio of 6:2:2 for the ETT datasets and 7:1:2 for the other datasets (Wu et al., 2021). (A chronological-split sketch is given below the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware used to run the experiments (e.g., GPU models, CPU types, or memory specifications).
Software Dependencies | No | The paper does not provide version numbers for the software dependencies or libraries used in the implementation (e.g., Python, PyTorch, or other packages).
Experiment Setup | Yes | We conduct a hyperparameter search for three key parameters using the predefined validation dataset: the hidden dimension of the MLP (D ∈ {32, 64, 128}), patch size (P ∈ {12, 18, 24}), and input horizon (L ∈ {336, 512, 768}). For self-supervised learning, we utilize a shared pretrained weight for all prediction horizons, making it more efficient compared to supervised learning in the long term. In both self-supervised pretraining and supervised learning, we utilize an epoch size of 100. During fine-tuning in self-supervised learning, we apply linear probing for either 10 or 20 epochs, depending on the dataset, to update the model head. Subsequently, we perform end-to-end fine-tuning of the entire network for twice the epoch duration of linear probing, following the approach outlined in PatchTST (Nie et al., 2023). The dropout ratio for the fully connected layer preceding the prediction head is set to 0.2. (A configuration sketch of this setup is given below the table.)
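
The chronological splits quoted in the Dataset Splits row can be made concrete with a short sketch. The helper below is illustrative only (the function name and array handling are assumptions, not code from the PITS repository); it cuts a series into train/validation/test segments in time order using the reported 6:2:2 or 7:1:2 ratios.

```python
import numpy as np

def chronological_split(series: np.ndarray, ratios=(0.6, 0.2, 0.2)):
    """Split a time series into train/val/test segments in chronological order.

    `ratios` is (train, val, test): 6:2:2 is reported for the ETT datasets
    and 7:1:2 for Weather, Traffic, and Electricity.
    """
    n = len(series)
    n_train = int(n * ratios[0])
    n_val = int(n * ratios[1])
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

# Example: 7:1:2 split on a synthetic stand-in shaped (time steps, channels).
data = np.random.randn(10000, 21)
train, val, test = chronological_split(data, ratios=(0.7, 0.1, 0.2))
print(train.shape, val.shape, test.shape)
```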
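
Likewise, the search space and training schedule described in the Experiment Setup row can be summarized as a configuration sketch. The dictionary keys and helper function below are hypothetical conveniences for illustration, not the authors' actual configuration format.

```python
from itertools import product

# Hyperparameter search space reported in the paper:
# hidden dimension D, patch size P, and input horizon L.
search_space = {
    "hidden_dim": [32, 64, 128],
    "patch_size": [12, 18, 24],
    "input_horizon": [336, 512, 768],
}

def finetune_schedule(linear_probe_epochs: int) -> dict:
    """Training schedule as described: 100 epochs of pretraining (or supervised
    training), linear probing for 10 or 20 epochs depending on the dataset,
    then end-to-end fine-tuning for twice the linear-probing epochs."""
    assert linear_probe_epochs in (10, 20)
    return {
        "pretrain_epochs": 100,
        "linear_probe_epochs": linear_probe_epochs,
        "full_finetune_epochs": 2 * linear_probe_epochs,
        "head_dropout": 0.2,  # dropout before the prediction head
    }

# Enumerate the 3 x 3 x 3 = 27 combinations to be scored on the validation split.
for D, P, L in product(*search_space.values()):
    config = {"hidden_dim": D, "patch_size": P, "input_horizon": L,
              **finetune_schedule(linear_probe_epochs=10)}
    # ...train and evaluate on the validation split here...
```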