Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

SGN: Shifted Window-Based Hierarchical Variable Grouping for Multivariate Time Series Classification

Authors: Zenan Ying, Zhi Zheng, Huijun Hou, Tong Xu, Qi Liu, Jinke Wang, Wei Chen

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on diverse benchmark datasets from multiple domains demonstrate that SGN consistently achieves state-of-the-art performance, with an average improvement of 4.2% over existing methods.
Researcher Affiliation	Collaboration	Zenan Ying1, Jinke Wang1, Zhi Zheng1, Tong Xu1, Wei Chen1, Qi Liu1, Huijun Hou2; 1State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China; 2Nio
Pseudocode	No	The paper describes the methodology in prose and mathematical formulations, but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	Yes	We release the source code at https://github.com/colison/SGN.
Open Datasets	Yes	To evaluate the effectiveness of the proposed Swin Group Net model, we conduct extensive experiments on a diverse set of multivariate time series datasets. The detailed main dataset information is provided in Table 1. Furthermore, to comprehensively assess the generalization ability of our model, we additionally select 10 multivariate datasets from the UEA Time Series Classification Archive, which was introduced by Bagnall [47].
Dataset Splits	Yes	In our main experimental setup, the training, validation, and testing sets are partitioned either by subject or according to a fixed ratio, depending on the dataset characteristics. Samples from each subject are assigned to the respective sets following a fixed allocation ratio. Importantly, samples from the same subject are restricted to a single subset to avoid any data leakage. This design ensures the independence and objectivity of model training and evaluation. ... For model training, we adopt a subject-independent split, allocating 60%, 20% and 20% of subjects (and their corresponding samples) to the training, validation, and test sets, respectively. ... In our study, we use the processed version with 9 selected channels and 128 timestamps per sample, resulting in a total of 10,299 labeled samples. The dataset is split into training and test sets based on a certain ratio.
Hardware Specification	Yes	All experiments are conducted using four NVIDIA RTX 4090 GPUs (24GB memory) with the PyTorch framework [71].
Software Dependencies	No	All experiments are conducted using four NVIDIA RTX 4090 GPUs (24GB memory) with the PyTorch framework [71]. The learning rate is fixed at 0.0001, and the Adam [70] optimizer is employed for all experiments. While PyTorch is mentioned, a specific version number is not provided, which is required for a reproducible description of software dependencies.
Experiment Setup	Yes	The learning rate is fixed at 0.0001, and the Adam [70] optimizer is employed for all experiments. The batch sizes are set according to the dataset: {32, 256, 32, 32} for TDBrain, PTB-XL, UCI-HAR, and FLAAP, respectively. In the data preprocessing stage, we adopt the processing pipeline from Wang [49] to ensure consistency and comparability across datasets. All models are trained for 100 epochs using five different random seeds (41 to 45), and we report the average results along with the standard deviations. To prevent overfitting, we adopt an early stopping strategy based on the F1 score on the validation set. Table 7: Experiment configuration of SGN (columns: # Groups / # β / # Embedding Dim / # Layers / # Kernels / # Period Window / # Channel Ratio). TDBRAIN: 4 / 0.1 / 32 / 5 / 7 / 26 / 2; PTB-XL: 4 / 0.1 / 64 / 5 / 7 / 25 / 2; UCI-HAR: 6 / 0.1 / 64 / 4 / 7 / 32 / 2; FLAAP: 5 / 0.1 / 64 / 2 / 7 / 50 / 2.
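
The subject-independent split quoted under Dataset Splits (60%, 20%, and 20% of subjects for training, validation, and testing, with each subject confined to one subset) can be illustrated with a minimal sketch. This is not the authors' code: the function name `subject_independent_split`, the input format (one subject identifier per sample), and the rounding of subject counts are assumptions made for illustration. The default seed of 41 is taken from the reported seed range (41 to 45).

```python
import random
from collections import defaultdict


def subject_independent_split(subject_ids, ratios=(0.6, 0.2, 0.2), seed=41):
    """Split sample indices into train/val/test without sharing subjects.

    subject_ids: one subject identifier per sample (assumed input format).
    ratios: fractions of subjects (not samples) assigned to each subset.
    seed: controls the subject shuffle so the split is repeatable.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    # Group sample indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)

    # Shuffle subjects deterministically; sorting first removes any
    # dependence on the order in which subjects were encountered.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)

    # Rounding here is an assumption; the remainder goes to the test set.
    n_train = round(ratios[0] * len(subjects))
    n_val = round(ratios[1] * len(subjects))
    groups = {
        "train": subjects[:n_train],
        "val": subjects[n_train:n_train + n_val],
        "test": subjects[n_train + n_val:],
    }

    # Expand each subject back into its sample indices.
    return {
        name: sorted(i for sid in subs for i in by_subject[sid])
        for name, subs in groups.items()
    }


if __name__ == "__main__":
    # Toy data: 10 hypothetical subjects with 3 samples each.
    ids = [s for s in range(10) for _ in range(3)]
    split = subject_independent_split(ids)
    print({name: len(indices) for name, indices in split.items()})
```

Because whole subjects are assigned to a subset, no subject's samples can appear in both training and evaluation data, which is the leakage the quoted passage is guarding against. Repeating the call with seeds 41 to 45 would give five different subject assignments, although the source does not state whether the seeds affect the split or only model initialization.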