Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Beyond Marginals: Learning Joint Spatio-Temporal Patterns for Multivariate Anomaly Detection
Authors: Padmaksha Roy, Almuatazbellah Boker, Lamine Mili
TMLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments conducted on multiple public benchmark multivariate and synthetic time series datasets demonstrate that both the variants of our model consistently outperform state-of-the-art techniques, achieving higher precision, recall, and AUC-ROC scores, especially when the time series exhibits joint spatio-temporal dependencies. |
| Researcher Affiliation | Academia | Padmaksha Roy, Department of Electrical and Computer Engineering, Virginia Tech; Almuatazbellah Boker, Department of Electrical and Computer Engineering, Virginia Tech; Lamine Mili, Department of Electrical and Computer Engineering, Virginia Tech |
| Pseudocode | Yes | Algorithm 1: Dependency Modeling via Log-Density of Multivariate Likelihood; Algorithm 2: Dependence Modeling via True Copula Log Density; Algorithm 3: Single batch update of Transformer encoder θ and latent dependency parameters ϕ |
| Open Source Code | Yes | https://github.com/padmaksha18/DACLM |
| Open Datasets | Yes | We present experimental results on five widely-used benchmark datasets: SWaT, WADI, SMAP, MSL, and SMD (Goh et al., 2017; Ahmed et al., 2017; Hundman et al., 2018; Su et al., 2019). These datasets are derived from various sources, including sensors in server machines, spacecraft, and water treatment or distribution systems. |
| Dataset Splits | Yes | Table 6: Details of the benchmark datasets used in our experiments. SWaT (2017): train length 495,000, test length 449,919, 12.13% anomalies in test, dimension 51; WADI (2017): train 784,537, test 172,801, 5.77% anomalies, dimension 123; MSL (2018): train 58,317, test 73,729, 10.53% anomalies, dimension 55; SMAP (2018): train 153,183, test 427,617, 12.79% anomalies, dimension 25; SMD (2019): train 25,300, test 25,301, 4.16% anomalies, dimension 38. |
| Hardware Specification | No | No specific hardware details are mentioned for running the experiments. The text does not specify GPU models, CPU types, or other computing resources used. |
| Software Dependencies | No | We utilize the Transformer encoder layer from the PyTorch library, which comprises a self-attention mechanism and a feedforward network. (No version number provided for PyTorch or any other software.) |
| Experiment Setup | Yes | Table 2: Comparison of performance metrics between Plain Transformer, Student-t Multivariate Model, and Student-t Copula Model across different window (context) lengths L (L = 20, 50, 100, 200). Appendix C, Hyperparameters: The following are the key hyperparameters on which our model performance depends. Window Size (L): each multivariate time-series snippet is of length L, meaning we process L consecutive timestamps as a single input frame. [...] Margin (δ) in Contrastive Loss: we define a margin δ that ensures that anomalies remain sufficiently below the normal log-likelihood region. |
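The two hyperparameters quoted above can be made concrete with a minimal sketch. This is an illustrative reconstruction, not the authors' implementation: the function names, the hinge form of the margin penalty, and the list-based windowing are assumptions; the paper's actual loss and data pipeline live in the linked repository.

```python
def contrastive_margin_loss(log_lik_normal, log_lik_anom, delta):
    """Hinge-style penalty (assumed form): charge a cost whenever an
    anomalous window's log-likelihood is not at least `delta` below
    the normal window's log-likelihood."""
    gap = log_lik_normal - log_lik_anom  # want gap >= delta
    return max(0.0, delta - gap)

def sliding_windows(series, L):
    """Cut a multivariate series (a length-T list of feature vectors)
    into overlapping snippets of L consecutive timestamps, matching
    the 'window size L' description above."""
    return [series[t:t + L] for t in range(len(series) - L + 1)]
```

For example, with δ = 5.0, an anomaly sitting 15 nats below a normal window incurs no penalty, while one sitting only 2 nats below incurs a penalty of 3.0; a series of T = 10 timestamps with L = 4 yields 7 windows.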