Generating and Reweighting Dense Contrastive Patterns for Unsupervised Anomaly Detection

Authors: Songmin Dai, Yifan Wu, Xiaoqiang Li, Xiangyang Xue

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on both MVTec AD and MVTec LOCO datasets also support the aforementioned observation and demonstrate that GRAD achieves competitive anomaly detection accuracy and superior inference speed.
Researcher Affiliation | Academia | Songmin Dai1*, Yifan Wu1*, Xiaoqiang Li1, Xiangyang Xue2. 1School of Computer Engineering and Science, Shanghai University; 2School of Computer Science, Fudan University. laodar@shu.edu.cn, Victor Wu@shu.edu.cn, xqli@shu.edu.cn, xyxue@fudan.edu.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | To validate the effectiveness and generalizability of our approach, we perform experiments on both MVTec AD (Bergmann et al. 2019) and MVTec LOCO (Bergmann et al. 2022).
Dataset Splits | Yes | Each sub-dataset in MVTec AD and MVTec LOCO contains limited training images. To train competitive detectors from scratch for each small sub-dataset, we adopt general data augmentations on both normal and generated images like previous works (Bergmann et al. 2019, 2022).
Hardware Specification | Yes | Anomaly detection performance vs. latency per image on an NVIDIA Tesla V100 GPU.
Software Dependencies | No | The paper does not provide specific software dependency versions (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | Patch-level Detector. Each sub-dataset in MVTec AD and MVTec LOCO contains limited training images. To train competitive detectors from scratch for each small sub-dataset, we adopt general data augmentations on both normal and generated images like previous works (Bergmann et al. 2019, 2022). For level-34, 68, and 136 detectors, the images are respectively resized into 256×256, 128×128, and 64×64. We train the detector on batches of size 128(k+2) for 2,000 epochs and report the accuracy of the final epoch. Each batch contains 128 randomly cropped positive patches from 4 normal images and 128(k+1) negative patches from 4 normal images and 4k contrastive images, where k equals the number of levels of used generated contrastive images.
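The batch arithmetic in the quoted setup can be checked with a short sketch. This is an illustrative calculation only, not code from the paper; the function name and the level-to-resolution mapping below are assumptions based on the quoted text.

```python
# Sketch of the batch composition described in the quoted experiment setup.
# k = number of levels of generated contrastive images (an assumption:
# with level-34, 68, and 136 detectors, k would be 3).

def batch_composition(k: int) -> dict:
    """Patch counts for one training batch of total size 128*(k+2)."""
    positives = 128            # randomly cropped from 4 normal images
    negatives = 128 * (k + 1)  # from 4 normal images and 4k contrastive images
    return {
        "positives": positives,
        "negatives": negatives,
        "total": positives + negatives,  # equals 128 * (k + 2)
    }

# Per-level input resolutions quoted in the setup (hypothetical mapping
# from detector level to resized image size).
resolutions = {34: (256, 256), 68: (128, 128), 136: (64, 64)}

comp = batch_composition(k=3)
print(comp)  # 128 positives, 512 negatives, 640 patches in total
```

With k = 3 the batch holds 640 patches, matching the stated batch size 128(k+2) = 128 × 5.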