MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection

Authors: Haoyang He, Yuhu Bai, Jiangning Zhang, Qingdong He, Hongxu Chen, Zhenye Gan, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Lei Xie

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Comprehensive experiments on six diverse anomaly detection datasets and seven metrics demonstrate state-of-the-art performance, substantiating the method's effectiveness.
Researcher Affiliation Collaboration Zhejiang University; Youtu Lab, Tencent; Nanyang Technological University
Pseudocode No The paper describes the architecture and method components using textual descriptions and figures (Fig. 1, Fig. 2), but does not include pseudocode or clearly labeled algorithm blocks.
Open Source Code Yes The code and models are available at https://lewandofskee.github.io/projects/MambaAD.
Open Datasets Yes MVTec-AD [3] encompasses a diverse collection of 5 types of textures and 10 types of objects, 5,354 high-resolution images in total. 3,629 normal images are designated for training. The remaining 1,725 images are reserved for testing and include both normal and abnormal samples. VisA [58] features 12 different objects... Real-IAD [42] includes objects from 30 distinct categories... More results on MVTec-3D [5], as well as the newly proposed Uni-Medical [50, 2] and COCO-AD [52] datasets, can be viewed in Appendix 5.
Dataset Splits No 3,629 normal images are designated for training. The remaining 1,725 images are reserved for testing and include both normal and abnormal samples. (This only specifies train and test splits, no explicit validation split.)
Hardware Specification Yes The model undergoes a training period of 500 epochs for the multi-class setting, conducted on a single NVIDIA TESLA V100 32GB GPU.
Software Dependencies No The paper mentions the 'AdamW optimizer' but does not specify version numbers for software dependencies or the programming languages used.
Experiment Setup Yes All input images are resized to a uniform size of 256×256 without additional augmentation for consistency. A pre-trained ResNet34 acts as the feature extractor, while a Mamba decoder of equivalent depth [3, 4, 6, 3] to ResNet34 serves as the student model for training. ... The AdamW optimizer is employed with a learning rate of 0.005 and a decay rate of 1×10⁻⁴. The model undergoes a training period of 500 epochs for the multi-class setting...
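For reference, the hyperparameters quoted in the experiment-setup row can be collected into a single configuration. This is a minimal sketch, not code from the paper's repository: the key names and the dict layout are my own, while the values (256×256 input, ResNet34 backbone, decoder depth [3, 4, 6, 3], AdamW with learning rate 0.005 and decay 1×10⁻⁴, 500 epochs) are taken from the setup described above.

```python
# Hedged sketch of the reported MambaAD training configuration.
# Key names are illustrative; values come from the paper's described setup.
train_config = {
    "input_size": (256, 256),        # images resized, no extra augmentation
    "backbone": "resnet34",          # pre-trained feature extractor (teacher side)
    "decoder_depth": [3, 4, 6, 3],   # Mamba decoder, depth equivalent to ResNet34
    "optimizer": "AdamW",
    "learning_rate": 5e-3,           # 0.005
    "weight_decay": 1e-4,            # decay rate 1e-4
    "epochs": 500,                   # multi-class setting
    "gpu": "NVIDIA TESLA V100 32GB", # single GPU, per the hardware row
}

# Sanity check: decoder depth mirrors ResNet34's stage layout (total 16 blocks).
assert sum(train_config["decoder_depth"]) == 16
```

A dict like this could be passed to a training script or dumped to YAML; the paper itself does not prescribe any particular config format.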