Parametric Augmentation for Time Series Contrastive Learning

Authors: Xu Zheng, Tianchun Wang, Wei Cheng, Aitian Ma, Haifeng Chen, Mo Sha, Dongsheng Luo

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on univariate forecasting tasks demonstrate the highly competitive results of our method, with an average 6.5% reduction in MSE and 4.7% in MAE over the leading baselines. In classification tasks, AutoTCL achieves a 1.2% increase in average accuracy. With comprehensive experimental studies, we empirically verify the advantage of the proposed method on benchmark time series forecasting datasets.
Researcher Affiliation | Collaboration | ¹School of Computing and Information Sciences, Florida International University, US; ²College of Information Sciences and Technology, The Pennsylvania State University, US; ³NEC Laboratories America, US
Pseudocode | Yes | Algorithm 1: AutoTCL training algorithm
Open Source Code | Yes | The source code is available at https://github.com/AslanDing/AutoTCL.
Open Datasets | Yes | Six benchmark datasets, ETTh1, ETTh2, ETTm1 (Zhou et al., 2021), Electricity (Dua & Graff, 2017), Weather, and the Lora dataset, are adopted for time series forecasting... For the classification task, we evaluate our method on the UEA dataset (Dau et al., 2019), which contains 30 multivariate time series datasets.
Dataset Splits | No | The paper mentions training and testing but does not explicitly provide details about a validation dataset split or its specific size/percentage within the provided text.
Hardware Specification | Yes | All experiments are conducted on a Linux machine with 8 NVIDIA A100 GPUs, each with 40GB of memory. The software environment is CUDA 11.6 and Driver Version 520.61.05.
Software Dependencies | Yes | We used Python 3.9.13 and PyTorch 1.12.1 to construct our project.
Experiment Setup | Yes | Optimizer: Two Adam optimizers (Kingma & Ba, 2014) were used for the augmentation network and the feature extraction network, with the learning rate set to 0.001 and the decay rates kept at the default (0.9, 0.999). Encoder architecture: The depth of the multi-layer dilated CNN module and the hidden dimension were tunable, searched in {6, 7, 8, 9, 10} and {256, 128, 64, 32, 16, 8}, respectively. In training, a dropout rate tuned in [0.01, 1] was used to avoid overfitting.
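
To make the reported setup concrete, below is a minimal PyTorch sketch of the two-optimizer configuration and the hyperparameter search space from the Experiment Setup row. The `DilatedConvEncoder` and `AugmentNet` module names, their internal architectures, and the `build_trial` helper are illustrative assumptions rather than the authors' implementation; only the learning rate, the (0.9, 0.999) betas, the depth/hidden-dimension grids, and the dropout range are taken from the row above.

```python
import random

import torch
import torch.nn as nn


class DilatedConvEncoder(nn.Module):
    """Hypothetical multi-layer dilated CNN encoder (name and layout assumed)."""

    def __init__(self, in_dim: int, hidden_dim: int, depth: int, dropout: float):
        super().__init__()
        layers, ch = [], in_dim
        for i in range(depth):
            # Exponentially growing dilation; padding chosen to preserve sequence length.
            layers += [
                nn.Conv1d(ch, hidden_dim, kernel_size=3, padding=2**i, dilation=2**i),
                nn.GELU(),
                nn.Dropout(dropout),
            ]
            ch = hidden_dim
        self.net = nn.Sequential(*layers)

    def forward(self, x):  # x: (batch, channels, length)
        return self.net(x)


class AugmentNet(nn.Module):
    """Hypothetical parametric augmentation network (placeholder architecture)."""

    def __init__(self, in_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, hidden_dim, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv1d(hidden_dim, in_dim, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return self.net(x)


# Search grids quoted in the experiment setup.
DEPTHS = [6, 7, 8, 9, 10]
HIDDEN_DIMS = [256, 128, 64, 32, 16, 8]


def build_trial(in_dim: int):
    """Sample one configuration and build both networks with their own Adam optimizers."""
    depth = random.choice(DEPTHS)
    hidden_dim = random.choice(HIDDEN_DIMS)
    dropout = random.uniform(0.01, 1.0)  # dropout rate tuned in [0.01, 1]

    encoder = DilatedConvEncoder(in_dim, hidden_dim, depth, dropout)
    augmenter = AugmentNet(in_dim)

    # Two Adam optimizers, one per network, lr = 0.001, betas = (0.9, 0.999).
    enc_opt = torch.optim.Adam(encoder.parameters(), lr=1e-3, betas=(0.9, 0.999))
    aug_opt = torch.optim.Adam(augmenter.parameters(), lr=1e-3, betas=(0.9, 0.999))
    return encoder, augmenter, enc_opt, aug_opt
```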