Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

FrameShield: Adversarially Robust Video Anomaly Detection

Authors: Mojtaba Nafez, Mobina Poulaei, Nikan Vasei, Bardia Moakhar, Mohammad Sabokrou, Mohammad Hossein Rohban

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate that our method significantly enhances the robustness of WSVAD models against adversarial attacks, outperforming state-of-the-art methods by an average of 71.0% in overall AUROC performance across multiple benchmarks.
Researcher Affiliation	Academia	Mojtaba Nafez Department of Computer Engineering Sharif University of Technology EMAIL; Mobina Poulaei Department of Computer Engineering Sharif University of Technology EMAIL; Nikan Vasei Department of Computer Engineering Sharif University of Technology EMAIL; Bardia Soltani Moakhar Department of Industrial Engineering Sharif University of Technology EMAIL; Mohammad Sabokrou Machine Learning and Data Science Unit Okinawa Institute of Science and Technology EMAIL; Mohammad Hossein Rohban Department of Computer Engineering Sharif University of Technology EMAIL
Pseudocode	Yes	Appendix T, "Pseudocode of FrameShield": We present the pseudocode in Algorithm 1 for our FrameShield framework, which comprises two sequential phases: (1) weakly supervised training using the proposed PromptMIL formulation, and (2) fully supervised adversarial training using pseudo-anomalies generated by our SRD (Spatiotemporal Region Distortion) module.
Open Source Code	Yes	The implementation and code are publicly available at https://github.com/rohban-lab/FrameShield.
Open Datasets	Yes	We evaluate our method on well-established benchmarks: MSAD Zhu et al. [2024], UCF-Crime Sultani et al. [2018], ShanghaiTech Liu et al. [2018], TAD Lv et al. [2021], and UCSD Ped2 Mahadevan et al. [2010]; additional details are provided in Appendix A.
Dataset Splits	Yes	MSAD: The dataset is split into a training set with 480 videos (120 abnormal / 360 normal) and a test set with 440 videos (120 abnormal / 320 normal). UCF-Crime: The training set includes video-level annotations with 800 normal and 810 abnormal videos, while the testing set provides frame-level labels for 140 normal and 150 abnormal videos... The final dataset used for training included 410 normal and 410 abnormal videos, and the test set comprised 75 normal and 75 abnormal videos. ShanghaiTech: This results in 238 training videos (63 abnormal and 175 normal) and 199 testing videos (44 abnormal and 155 normal)... TAD: The dataset is divided into a training set of 400 videos and a testing set of 100 videos, which includes 60 normal and 40 abnormal instances. UCSD-Ped2: ...the dataset is restructured by randomly selecting six anomalous and four normal videos for training, while the remaining 18 videos (12 normal and 6 anomalous) are used for testing. This sampling process is repeated ten times...
Hardware Specification	Yes	We conducted our experiments on 2 NVIDIA GeForce RTX 4090 GPUs (24 GB), with the pipeline completing in approximately 30 hours.
Software Dependencies	No	We conducted adversarial training for 40 epochs using the AdamW optimizer with a learning rate of 8 × 10⁻⁶, a chunk size of 16 frames and ϵ = 0.5/255. A cosine scheduler was employed to gradually decrease the learning rate. Additionally, to train PromptMIL with X-Clip Ma et al. [2022] as a feature extractor and get pseudo-labels, we required approximately 4 hours.
Experiment Setup	Yes	For training, we used a learning rate of 8 × 10⁻⁶ with a chunk size of 16 frames. The model was trained for 40 epochs using the AdamW optimizer, which effectively incorporates weight decay. To schedule the learning rate, we applied a Cosine scheduler, which progressively reduces the learning rate following a cosine decay pattern.
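
The cosine decay described in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code. It assumes the reported learning rate reads 8 × 10^-6, that the schedule is stepped once per epoch over the 40 reported epochs, that there is no warmup, and that the rate decays toward zero; none of these details beyond the rate and epoch count are stated in the quoted text. The names `cosine_lr`, `BASE_LR`, `EPOCHS`, and `MIN_LR` are hypothetical.

```python
import math

BASE_LR = 8e-6  # reported learning rate (assumed to mean 8 x 10^-6)
EPOCHS = 40     # reported number of training epochs
MIN_LR = 0.0    # assumption: the schedule floor is not given in the source


def cosine_lr(epoch: int, base_lr: float = BASE_LR,
              total: int = EPOCHS, min_lr: float = MIN_LR) -> float:
    """Learning rate at the start of `epoch` under plain cosine annealing."""
    progress = epoch / total  # fraction of training completed, in [0, 1]
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# One value per epoch; starts at BASE_LR and decreases every epoch.
schedule = [cosine_lr(e) for e in range(EPOCHS)]

if __name__ == "__main__":
    for e in (0, 10, 20, 30, 39):
        print(f"epoch {e:2d}: lr = {schedule[e]:.3e}")
```

Under these assumptions the rate falls to half its initial value at the midpoint (epoch 20) and approaches zero by the end of training. A framework scheduler with per-iteration stepping or a nonzero floor would give a different curve.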