Denoising Diffusion-Augmented Hybrid Video Anomaly Detection via Reconstructing Noised Frames

Authors: Kai Cheng, Yaning Pan, Yang Liu, Xinhua Zeng, Rui Feng

IJCAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To demonstrate the effectiveness of our proposed DHVAD, we conduct experiments on three public benchmarks: UCSD Ped2 [Li et al., 2013], CUHK Avenue [Lu et al., 2013], and ShanghaiTech [Liu et al., 2018] datasets."
Researcher Affiliation | Academia | "Kai Cheng¹, Yaning Pan², Yang Liu¹,³, Xinhua Zeng¹ and Rui Feng². ¹Academy for Engineering and Technology, Fudan University; ²School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University; ³Department of Computer Science, University of Toronto."
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statements about open-source code availability or links to a code repository.
Open Datasets | Yes | "We conduct experiments on three public benchmarks: UCSD Ped2 [Li et al., 2013], CUHK Avenue [Lu et al., 2013], and ShanghaiTech [Liu et al., 2018] datasets."
Dataset Splits | No | The paper mentions training and testing phases ('during the training process', 'In the testing phase') and specifies batch sizes for each, but it does not provide explicit training/test/validation dataset splits or percentages, nor does it refer to predefined splits for reproduction.
Hardware Specification | Yes | "We utilize the PyTorch [Paszke et al., 2019] framework on an NVIDIA GeForce RTX 3090 GPU to implement our proposed DHVAD."
Software Dependencies | No | The paper mentions software like 'PyTorch', 'Cascade R-CNN pre-trained model', and 'FlowNet2.0 pre-trained model' but does not specify their version numbers for reproducibility.
Experiment Setup | Yes | "When training, hyperparameter timestep K is set to 1200. The variance schedule γ_k ∈ (0, 1), k = 1, ..., K is defined as a small linear schedule, increasing linearly from γ_1 = 10⁻⁴ to γ_K = 0.02. The timestep is encoded with transformer sinusoidal positional embedding [Vaswani et al., 2017]. It is noted that we set the sample distance λ = 70. We set the training batch size and testing batch size to 64 and 128, respectively."
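
To make the Experiment Setup row above concrete, here is a minimal PyTorch sketch of the reported diffusion configuration: K = 1200 timesteps, a linear variance schedule rising from γ_1 = 10⁻⁴ to γ_K = 0.02, and a transformer sinusoidal timestep embedding [Vaswani et al., 2017]. The paper releases no code, so everything beyond those reported values is an assumption: the standard DDPM forward-process parameterization, the embedding width of 128, the 256x256 frame resolution, and names such as timestep_embedding are illustrative, not the authors' implementation.

```python
# Sketch of the diffusion schedule and timestep embedding from the
# Experiment Setup row. Only K, the schedule endpoints, and the batch
# size are reported in the paper; the rest is assumed.
import math
import torch

K = 1200  # number of diffusion timesteps reported in the paper

# Linear variance schedule: gamma_1 = 1e-4 increasing linearly to gamma_K = 0.02.
gammas = torch.linspace(1e-4, 0.02, K)

# Cumulative products used by the standard DDPM forward process
# q(x_k | x_0) = N(sqrt(alpha_bar_k) x_0, (1 - alpha_bar_k) I),
# assuming the paper follows the usual DDPM parameterization.
alphas = 1.0 - gammas
alpha_bars = torch.cumprod(alphas, dim=0)

def timestep_embedding(t: torch.Tensor, dim: int = 128) -> torch.Tensor:
    """Transformer-style sinusoidal positional embedding of timestep t
    [Vaswani et al., 2017]. `dim` is an assumed embedding width; the
    paper does not state it."""
    half = dim // 2
    freqs = torch.exp(
        -math.log(10000.0) * torch.arange(half, dtype=torch.float32) / half
    )
    args = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)

if __name__ == "__main__":
    # Noise a batch of frames to random timesteps, DDPM-style.
    x0 = torch.randn(64, 3, 256, 256)        # training batch size 64; resolution assumed
    k = torch.randint(0, K, (x0.shape[0],))  # random timestep per sample
    a_bar = alpha_bars[k].view(-1, 1, 1, 1)
    noise = torch.randn_like(x0)
    x_k = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise  # noised frames
    emb = timestep_embedding(k)              # (64, 128) conditioning vector
    print(x_k.shape, emb.shape)
```

The reported sample distance λ = 70 and the testing batch size of 128 would enter at the sampling and data-loading stages, which this sketch does not cover.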