Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Functional Virtual Adversarial Training for Semi-Supervised Time Series Classification

Authors: Qingyi Pan, Yicheng Li

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct extensive experiments on real-world datasets to verify that f-VAT significantly outperforms other competitive baselines (e.g., up to 9.42% on Cricket X and 8.30% on Self Reg). Further visualization indicates that, compared to the original VAT, our proposed functional adversarial perturbations lead to more stable convergence and better final performance. Extensive experimental results (Section 4.2) on semi-supervised time series classification are provided to demonstrate the superiority of f-VAT over existing methods. Additional visualization results (Section 4.3) indicate that functional adversarial perturbations can significantly smooth the loss landscape to achieve stable convergence and better performance.
Researcher Affiliation	Academia	Qingyi Pan, Department of Statistics and Data Science, Tsinghua University, Beijing, China; Yicheng Li, Department of Statistics and Data Science, Tsinghua University, Beijing, China
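Pseudocode	Yes	Algorithm 1 Functional Virtual Adversarial Training Step. 1: Input: data batch $D$, $D_l$, model $f_\theta$, order of the Sobolev norm $s \ge 0$, radius $\epsilon$, adversarial iterations $L$, learning rate $\eta$. 2: for each sample $X_i \in D$ do (approximate $r_i$) 3: Randomly initialize perturbation vector $r_i$ over $\lVert r_i \rVert_{H^s} \le \epsilon$. 4: for $\ell = 1, \dots, L$ do 5: Gradient ascent: $r_i \leftarrow r_i + \eta \nabla_{r_i} \mathrm{LDS}(X_i, r_i; f_\theta)$. 6: Normalize: $r_i \leftarrow \epsilon \, r_i / \lVert r_i \rVert_{H^s}$. 7: end for 8: end for 9: $\theta \leftarrow \theta - \eta \nabla_\theta \mathcal{L}(\theta)$, where $\mathcal{L}(\theta) = \mathcal{L}_0(D_l; f_\theta) + \frac{1}{\lvert D \rvert} \sum_{X_i \in D} \mathrm{LDS}(X_i, r_i; f_\theta)$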
Open Source Code	Yes	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We include our codes and instructions to reproduce our main experimental results in the supplemental material.
Open Datasets	Yes	We use dozens of publicly available datasets from the UCR and UEA repositories [11], including the representative univariate dataset (i.e., Cricket X, UWave, and Insect Wing) and the multivariate dataset (i.e., Self Reg, NATOPS, and Heartbeat) in [19]. These representative datasets are from difficult to easy, and widely-used in semi-supervised time series classification [19, 21]. Additionally, we construct more empirical results on several large-scale China Securities Index (CSI) datasets (i.e., CSI 50 and 500 futures) spanning from 2020 to 2023 for predicting directions (upward or downward) of futures prices [44, 33]. The dataset collects records spanning from 2020 to 2022.
Dataset Splits	Yes	Following [19], each dataset is split into train (60%), valid (20%), and test set (20%).
Hardware Specification	Yes	We run our experiments on eight NVIDIA A10 GPUs (each with 24 GB memory).
Software Dependencies	No	The paper does not explicitly provide specific version numbers for software dependencies such as programming languages, deep learning frameworks, or other libraries in the provided text. While the NeurIPS checklist indicates that code and instructions for reproduction are in the supplemental material, these specific version details are not present in the accessible paper text.
Experiment Setup	Yes	We use stochastic gradient descent with a learning rate of $10^{-3}$. The batch size is set to 64 with a maximum of 300 epochs. Due to the model-agnostic properties of f-VAT, we use an eight-layers Temporal Convolutional Network (TCN) [5] as the backbone architecture to compare with other competitive baselines.
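
The inner loop of Algorithm 1 quoted in the Pseudocode row (steps 2 to 8) can be illustrated with a small, dependency-free sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the model is a toy logistic classifier rather than a TCN, LDS is taken to be the KL divergence between predictions on the clean and perturbed series (as in the original VAT), the Sobolev norm is approximated by a Fourier-weighted sum with weights (1 + k^2)^s, and gradients are estimated by finite differences. All function names are hypothetical.

```python
import cmath
import math
import random


def dft(x):
    """Normalized discrete Fourier transform (O(n^2), fine for short series)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]


def sobolev_norm(r, s):
    """Assumed discrete stand-in for an order-s Sobolev norm:
    sqrt(sum_k (1 + k^2)^s * |r_hat_k|^2), with symmetric frequency index k."""
    n = len(r)
    total = 0.0
    for k, c in enumerate(dft(r)):
        freq = min(k, n - k)
        total += (1.0 + freq ** 2) ** s * abs(c) ** 2
    return math.sqrt(total)


def predict(w, x):
    """Toy model f_theta: logistic regression on the raw series."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def lds(w, x, r):
    """Assumed LDS: KL divergence between predictions on x and on x + r."""
    p = predict(w, x)
    q = predict(w, [xi + ri for xi, ri in zip(x, r)])
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def grad_r(w, x, r, h=1e-5):
    """Central finite-difference gradient of LDS with respect to r."""
    grad = []
    for j in range(len(r)):
        up, down = r[:], r[:]
        up[j] += h
        down[j] -= h
        grad.append((lds(w, x, up) - lds(w, x, down)) / (2.0 * h))
    return grad


def rescale(r, s, eps):
    """Step 6: rescale r so that its (assumed) Sobolev norm equals eps."""
    norm = sobolev_norm(r, s)
    return [eps * ri / norm for ri in r]


def adversarial_perturbation(w, x, s, eps, n_iters, lr, rng):
    """Steps 3 to 7: random start on the eps-sphere, then ascent + rescale."""
    r = rescale([rng.gauss(0.0, 1.0) for _ in x], s, eps)
    for _ in range(n_iters):
        g = grad_r(w, x, r)
        r = [ri + lr * gi for ri, gi in zip(r, g)]
        r = rescale(r, s, eps)
    return r


rng = random.Random(0)
x = [math.sin(2.0 * math.pi * t / 16) for t in range(16)]  # toy series
w = [rng.uniform(-0.5, 0.5) for _ in x]                    # toy weights
r_adv = adversarial_perturbation(w, x, s=1.0, eps=0.1, n_iters=5, lr=0.01, rng=rng)
print(sobolev_norm(r_adv, 1.0), lds(w, x, r_adv))
```

In this sketch, a larger `s` penalizes high-frequency components of the perturbation more heavily, so a perturbation with the same norm budget `eps` is pushed toward smoother shapes. The parameter update in step 9 is omitted; in practice it would be handled by an automatic differentiation framework.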