Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
PhysDiff: A Physically-Guided Diffusion Model for Multivariate Time Series Anomaly Detection
Authors: Long Li, Wencheng Zhang, Shi Yuan, Hongle Guo, Wanghu Chen
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five benchmark datasets and two NeurIPS-TS scenarios demonstrate that PhysDiff outperforms 18 state-of-the-art baselines, with average F1 score improvements on both standard and challenging datasets. Experimental results validate the advantages of combining principled signal decomposition with diffusion-based reconstruction for robust, interpretable anomaly detection in complex dynamic systems. |
| Researcher Affiliation | Academia | Long Li¹, Wencheng Zhang¹, Shi Yuan¹, Hongle Guo², Wanghu Chen¹. ¹College of Computer Science & Engineering, Northwest Normal University; ²School of Management, Northwest Normal University |
| Pseudocode | Yes | Algorithm 1 Physically-Guided Diffusion Process |
| Open Source Code | Yes | Code is available at https://anonymous.4open.science/r/PhysDiff-4726. |
| Open Datasets | Yes | We evaluated our approach using five widely recognized benchmark datasets: SMD [19], MSL [7], SMAP [7], SWaT [20], and PSM [21], plus the NeurIPS-TS dataset comprising Creditcard and GECCO subsets as detailed by Lai et al. (2021) [1]. Data labeled as normal were partitioned with 80% allocated for training and 20% for validation, ensuring the model is properly optimized on typical behavior. These datasets represent diverse domains including spacecraft telemetry, water treatment systems, and financial transactions, providing a comprehensive evaluation landscape for anomaly detection methods. |
| Dataset Splits | Yes | Data labeled as normal were partitioned with 80% allocated for training and 20% for validation, ensuring the model is properly optimized on typical behavior. These datasets represent diverse domains including spacecraft telemetry, water treatment systems, and financial transactions, providing a comprehensive evaluation landscape for anomaly detection methods. |
| Hardware Specification | Yes | Implementation Environment: Experiments were conducted using PyTorch 2.1.2 on NVIDIA GTX 2080Ti with 22GB memory. |
| Software Dependencies | Yes | Implementation Environment: Experiments were conducted using PyTorch 2.1.2 on NVIDIA GTX 2080Ti with 22GB memory. Our implementation includes optimizations: (1) CUDA-accelerated MAFD calculations using nvmath when available, (2) efficient time series embeddings with convolutional layers of kernel size 1, (3) optimized batch matrix multiplications for routing attention, and (4) reconstruction head with two-layer MLP, GELU activation, and dropout rate 0.2. |
| Experiment Setup | Yes | In our experiments, we implement PhysDiff with careful attention to model architecture, training procedure, and anomaly detection strategies, with all key hyperparameters summarized in Table 7. |
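
The Dataset Splits row quotes an 80%/20% train/validation partition of the data labeled as normal. The following is a minimal sketch of how such a split might look; it is not taken from the PhysDiff code. The function name `split_normal_data`, the convention that label 0 means normal, and the choice to preserve order rather than shuffle are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def split_normal_data(
    windows: Sequence[Sequence[float]],
    labels: Sequence[int],
    train_fraction: float = 0.8,
) -> Tuple[List[Sequence[float]], List[Sequence[float]]]:
    """Keep only samples labeled normal and split them into train/validation."""
    if len(windows) != len(labels):
        raise ValueError("windows and labels must have the same length")
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")

    # Assumption: label 0 marks normal data and any other value marks an anomaly.
    normal = [w for w, y in zip(windows, labels) if y == 0]

    # Assumption: order is preserved (no shuffling), so validation is the tail.
    cut = int(len(normal) * train_fraction)
    return normal[:cut], normal[cut:]


if __name__ == "__main__":
    # Twelve toy one-dimensional samples, two of which are labeled anomalous.
    windows = [[float(i)] for i in range(12)]
    labels = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
    train, val = split_normal_data(windows, labels)
    print(len(train), len(val))
```

Keeping the split order-preserving avoids leaking future observations into training for time series; whether the paper shuffles before splitting is not stated in the quoted text.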