A Diffusion-Based Framework for Multi-Class Anomaly Detection

Authors: Haoyang He, Jiangning Zhang, Hongxu Chen, Xuhai Chen, Zhishan Li, Xu Chen, Yabiao Wang, Chengjie Wang, Lei Xie

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach which surpasses the state-of-the-art methods |
| Researcher Affiliation | Collaboration | (1) College of Control Science and Engineering, Zhejiang University; (2) Youtu Lab, Tencent |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states "Code is available at https://lewandofskee.github.io/projects/diad." However, this is a project page and not a direct link to a code repository like GitHub, GitLab, or Bitbucket, as per the strict definition provided. |
| Open Datasets | Yes | MVTec-AD (Bergmann et al. 2019) dataset simulates real-world industrial production scenarios, filling the gap in unsupervised anomaly detection. It consists of 5 types of textures and 10 types of objects, in 5,354 high-resolution images from different domains. The training set contains 3,629 images with only anomaly-free samples. VisA (Zou et al. 2022) dataset consists of a total of 10,821 high-resolution images, including 9,621 normal images and 1,200 anomaly images with 78 types of anomalies. |
| Dataset Splits | No | The paper mentions a training set and a test set, but does not explicitly describe a validation set or its split. |
| Hardware Specification | Yes | We train for 1000 epochs on a single NVIDIA Tesla V100 32GB with a batch size of 12. |
| Software Dependencies | No | The paper mentions using "Adam optimiser" and a "Gaussian filter" but does not specify version numbers for any software libraries, frameworks, or programming languages used. |
| Experiment Setup | Yes | We train for 1000 epochs on a single NVIDIA Tesla V100 32GB with a batch size of 12. Adam optimiser (Loshchilov and Hutter 2019) with a learning rate of 1e-5 is set. A Gaussian filter with σ = 5 is used to smooth the anomaly localization score. For anomaly detection, the anomaly score of the image is the maximum value of the averagely pooled anomaly localization score which undergoes 8 rounds of global average pooling operations with a size of 8 × 8. During inference, the initial denoising timestep T is set from 1,000. We use DDIM (Song, Meng, and Ermon 2021) as the sampler with 10 steps by default. |
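
The post-processing described in the Experiment Setup entry can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the function names are hypothetical, the pooling is assumed to use stride-1 windows without padding (the quoted text does not specify stride or padding), and the DDIM schedule is assumed to be a uniform subsequence of the 1,000 training timesteps.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gaussian_kernel_1d(sigma: float, truncate: float = 4.0) -> np.ndarray:
    """Normalized 1-D Gaussian kernel covering +/- truncate * sigma."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def gaussian_smooth(score_map: np.ndarray, sigma: float = 5.0) -> np.ndarray:
    """Separable Gaussian blur with mirrored borders; output keeps the input shape."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(score_map.astype(np.float64), radius, mode="symmetric")
    # Filter along rows, then along columns; "valid" trims the padding back off.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def avg_pool(score_map: np.ndarray, size: int = 8) -> np.ndarray:
    """Average pooling over size x size windows (assumption: stride 1, no padding)."""
    windows = sliding_window_view(score_map, (size, size))
    return windows.mean(axis=(-2, -1))


def image_anomaly_score(score_map: np.ndarray, rounds: int = 8, size: int = 8) -> float:
    """Image-level score: maximum of the map after repeated average pooling."""
    pooled = score_map
    for _ in range(rounds):
        pooled = avg_pool(pooled, size)
    return float(pooled.max())


def ddim_timesteps(total_steps: int = 1000, sampling_steps: int = 10) -> np.ndarray:
    """Uniformly spaced timesteps in descending order (assumed schedule)."""
    stride = total_steps // sampling_steps
    return np.arange(0, total_steps, stride)[::-1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw_map = rng.random((256, 256))         # stand-in for a pixel-level anomaly map
    smoothed = gaussian_smooth(raw_map, sigma=5.0)
    print(smoothed.shape)                    # localization map, same size as input
    print(image_anomaly_score(smoothed))     # single image-level score
    print(ddim_timesteps())                  # 10 timesteps out of 1,000
```

With stride-1 windows each pooling round shrinks the map by 7 pixels per side, so 8 rounds need an input of at least 57 × 57. A different stride or padding choice would change the map sizes, though the score would still be the maximum of an averaged map.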