Segment, Shuffle, and Stitch: A Simple Layer for Improving Time-Series Representations
Authors: Shivam Grover, Amin Jalali, Ali Etemad
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments on several datasets and state-of-the-art baselines, we show that incorporating S3 results in significant improvements for the tasks of time-series classification, forecasting, and anomaly detection, improving performance on certain datasets by up to 68%. We also show that S3 makes the learning more stable with a smoother training loss curve and loss landscape compared to the original baseline. |
| Researcher Affiliation | Academia | Shivam Grover, Amin Jalali, Ali Etemad (Queen's University, Canada), {shivam.grover, amin.jalali, ali.etemad}@queensu.ca |
| Pseudocode | No | The paper describes the Segment, Shuffle, and Stitch steps in detail with mathematical formulas and explanations, but it does not present them within a formally labeled "Pseudocode" or "Algorithm" block. (A hedged code sketch of these three steps appears below the table.) |
| Open Source Code | Yes | The code is available at https://github.com/shivam-grover/S3-TimeSeries. |
| Open Datasets | Yes | For classification we use the following datasets: (1) The UCR archive [44] which consists of 128 univariate datasets, (2) the UEA archive [45] which consists of 30 multivariate datasets, and (3) three multivariate datasets namely EEG, EEG2, and HAR from the UCI archive [46]. For our experiments with a pre-trained foundation model in Section 5, we also use the PTB-XL [47] dataset. |
| Dataset Splits | Yes | For classification, we measure the improvement as the percentage difference (Diff.) in accuracy resulting from S3, calculated as (Acc(Baseline+S3) − Acc(Baseline)) / Acc(Baseline). For forecasting, since lower MSE is better, we use (MSE(Baseline) − MSE(Baseline+S3)) / MSE(Baseline). A similar equation is used for measuring the percentage difference in MAE. The train/test splits for all classification datasets are as provided in the original papers. (Helper functions computing these differences appear below the table.) |
| Hardware Specification | Yes | Our experiments are conducted on a single NVIDIA Quadro RTX 6000 GPU. |
| Software Dependencies | No | Our code is implemented with PyTorch, and our experiments are conducted on a single NVIDIA Quadro RTX 6000 GPU. We release the code at: https://github.com/shivam-grover/S3TimeSeries. The paper mentions PyTorch but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | All implementation details match those of the baselines. We used the exact hyperparameters of the baselines according to the original papers when they were specified in the papers or when the code was available; alternatively, when the hyperparameters were not exactly specified in the paper or in the code, we tried to maximize performance with our own search for the optimum hyperparameters. Additionally, for experiments involving PatchTST and LaST with the Electricity dataset, we use a batch size of 8 due to memory constraints. Accordingly, some baseline results may slightly differ from those available in the original papers. Note that deviations in baseline results affect both the baseline model and baseline+S3. For the weighted sum operation in S3, we use Conv1D. Our code is implemented with PyTorch, and our experiments are conducted on a single NVIDIA Quadro RTX 6000 GPU. We release the code at: https://github.com/shivam-grover/S3TimeSeries. Table 6 also lists the range of hyperparameters (n, φ, θ, λ) used. |
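
Since the paper provides no formal pseudocode block, the following is a minimal, hypothetical PyTorch sketch of one Segment, Shuffle, and Stitch layer as summarized in the table above: segment the series into n pieces, reorder them by learned priorities, stitch them back together, and combine with the original input via a Conv1D weighted sum. The class name `S3Sketch`, the tensor shapes, and the hard `argsort` shuffle are our assumptions, not the authors' method; the paper trains the shuffle end-to-end with a differentiable mechanism, and the released repository is authoritative.

```python
import torch
import torch.nn as nn


class S3Sketch(nn.Module):
    """Illustrative sketch of one Segment, Shuffle, and Stitch layer.

    Assumes input of shape (batch, channels, length) with length
    divisible by n_segments. NOTE: the hard argsort used below is a
    simplification that blocks gradients to the priority scores; the
    paper's layer uses a differentiable shuffling mechanism.
    """

    def __init__(self, channels: int, n_segments: int):
        super().__init__()
        self.n_segments = n_segments
        # One learnable priority per segment; sorting these defines
        # the shuffled order of the segments.
        self.priorities = nn.Parameter(torch.randn(n_segments))
        # Weighted sum of the original and stitched sequences,
        # realized with a Conv1D as the paper states.
        self.mix = nn.Conv1d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, t = x.shape
        seg_len = t // self.n_segments
        # Segment: split the series into n non-overlapping segments.
        segments = x.view(b, c, self.n_segments, seg_len)
        # Shuffle: reorder the segments by the learned priorities.
        order = torch.argsort(self.priorities)
        shuffled = segments[:, :, order, :]
        # Stitch: concatenate the segments back into a full series.
        stitched = shuffled.reshape(b, c, t)
        # Combine original and shuffled views with a learned weighting.
        return self.mix(torch.cat([x, stitched], dim=1))


# Usage: an 8-sample batch of 3-channel series of length 128.
layer = S3Sketch(channels=3, n_segments=4)
out = layer(torch.randn(8, 3, 128))  # -> shape (8, 3, 128)
```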
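
The percentage-difference metrics quoted in the Dataset Splits row are simple to compute; a sketch follows. The function names and the ×100 scaling are our assumptions (the paper reports the quantity as a percentage difference); note that the numerator is flipped between the accuracy and MSE variants because higher accuracy but lower MSE is better.

```python
def percent_diff_acc(acc_baseline: float, acc_baseline_s3: float) -> float:
    """Classification: (Acc(Baseline+S3) - Acc(Baseline)) / Acc(Baseline),
    expressed in percent. Positive values mean S3 improved accuracy."""
    return 100.0 * (acc_baseline_s3 - acc_baseline) / acc_baseline


def percent_diff_mse(mse_baseline: float, mse_baseline_s3: float) -> float:
    """Forecasting: (MSE(Baseline) - MSE(Baseline+S3)) / MSE(Baseline),
    expressed in percent. Positive values mean S3 lowered the error."""
    return 100.0 * (mse_baseline - mse_baseline_s3) / mse_baseline


# Example: accuracy rising from 0.80 to 0.84 is roughly a +5% improvement.
print(percent_diff_acc(0.80, 0.84))  # ~5.0
```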