Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Fortifying Time Series: DTW-Certified Robust Anomaly Detection
Authors: Shijie Liu, Tansu Alpcan, Christopher Leckie, Sarah Erfani
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across various datasets and models validate the effectiveness and practicality of our theoretical approach. Results demonstrate significantly improved performance, e.g., up to 18.7% in F1-score under DTW-based adversarial attacks compared to traditional certified models. |
| Researcher Affiliation | Academia | 1Department of Electrical and Electronic Engineering University of Melbourne, Melbourne, Australia 2School of Computing and Information Systems University of Melbourne, Melbourne, Australia |
| Pseudocode | No | The paper describes the approach and its implementation but does not present a distinct pseudocode or algorithm block. Figure 2 illustrates the process but is not pseudocode. |
| Open Source Code | Yes | The code and environment file are provided in the supplemental material. |
| Open Datasets | Yes | Our empirical evaluation of the DTW-certified defense spans seven widely used benchmark datasets, including SMAP [48], MSL [27], SMD [60], NIPS-TS-SWAN, NIPS-TS-CREDITCARD, NIPS-TS-WATER [30], UCR-1 and UCR-2 [68], encompassing both univariate and multivariate time-series data. |
| Dataset Splits | Yes | Table 4: Statistics of the benchmark datasets for time-series anomaly detection. SMAP 25 135,183 427,617 13.13%; MSL 55 58,317 73,729 10.72%; SMD 25 708,405 708,420 4.16%; NIPS-TS-SWAN 38 60,000 60,000 32.60%; NIPS-TS-CREDITCARD 29 284,807 284,807 0.17%; NIPS-TS-WATER 9 69,260 69,260 1.05%; UCR-1 1 35,000 44,795 1.38%; UCR-2 1 35,000 45,000 0.67% |
| Hardware Specification | Yes | All experiments are implemented using PyTorch and executed on a Linux server equipped with Intel(R) Xeon(R) Gold 6326 CPUs and NVIDIA A100 GPUs with 80 GB of memory. |
| Software Dependencies | No | All experiments are implemented using PyTorch and executed on a Linux server equipped with Intel(R) Xeon(R) Gold 6326 CPUs and NVIDIA A100 GPUs with 80 GB of memory. |
| Experiment Setup | Yes | We use the following default hyperparameters across all experiments unless otherwise specified: sequence length T = 50, DTW warping window size w = 4, number of noisy samples n = 1,000, smoothing noise level σ = 0.5 in N(0, σ²I), and percentile p = 0.5 in the percentile-smoothed function h_p. |
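
To make the hyperparameters in the Experiment Setup row concrete, the following is a minimal Python sketch of how they might fit together. It is an illustration under stated assumptions, not the authors' implementation: the function names `dtw_distance` and `percentile_smooth`, the absolute-difference local cost, the Sakoe-Chiba reading of the window size w, and the toy scoring function are all hypothetical. Only the default values (T = 50, w = 4, n = 1,000, σ = 0.5, p = 0.5) come from the row above.

```python
import numpy as np


def dtw_distance(a, b, w=4):
    """DTW distance between two 1-D sequences.

    Assumption: w is the half-width of a Sakoe-Chiba band and the
    local cost is the absolute difference; the paper may differ.
    """
    n, m = len(a), len(b)
    w = max(w, abs(n - m))  # the band must be wide enough to reach (n, m)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return float(cost[n, m])


def percentile_smooth(score_fn, x, n=1000, sigma=0.5, p=0.5, seed=0):
    """Empirical p-th quantile of score_fn(x + eps), eps ~ N(0, sigma^2 I).

    Assumption: score_fn is any callable that maps a window to a scalar
    anomaly score; here it stands in for the base detector.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n,) + np.shape(x))
    scores = np.array([score_fn(x + eps) for eps in noise])
    return float(np.quantile(scores, p))


if __name__ == "__main__":
    T = 50
    x = np.zeros(T)                        # toy window of length T
    toy_score = lambda z: float(np.mean(z))  # hypothetical scoring function
    print(percentile_smooth(toy_score, x))
    print(dtw_distance([0, 0, 1, 2], [0, 1, 2, 2], w=1))
```

In this sketch, a larger w lets DTW align sequences that are shifted further in time, while p = 0.5 makes the smoothed score the median over the noisy samples.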