Rethinking Reverse Distillation for Multi-Modal Anomaly Detection
Authors: Zhihao Gu, Jiangning Zhang, Liang Liu, Xu Chen, Jinlong Peng, Zhenye Gan, Guannan Jiang, Annan Shu, Yabiao Wang, Lizhuang Ma
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our MMRD outperforms recent state-of-the-art methods on both anomaly detection and localization on MVTec-3D AD and Eyecandies benchmarks. |
| Researcher Affiliation | Collaboration | Zhihao Gu1*, Jiangning Zhang2, Liang Liu2, Xu Chen2, Jinlong Peng2, Zhenye Gan2, Guannan Jiang3, Annan Shu3, Yabiao Wang2, Lizhuang Ma1 1School of Electronic and Electrical Engineering, Shanghai Jiao Tong University 2Youtu Lab, Tencent 3Contemporary Amperex Technology Co. Limited (CATL) |
| Pseudocode | No | The overall paradigm is shown in Fig. 3 and the algorithm table summarizing the proposed method is included in the supplementary material. |
| Open Source Code | No | Codes will be available upon acceptance. |
| Open Datasets | Yes | We conduct experiments on two multi-modal benchmarks, i.e., the MVTec 3D-AD (Bergmann et al. 2022) and the Eyecandies (Bonfiglioli et al. 2022). |
| Dataset Splits | No | The paper mentions training data and evaluation metrics, but it does not explicitly provide specific train/validation/test dataset splits (e.g., percentages or sample counts) needed for reproduction. |
| Hardware Specification | No | The paper mentions 'GPUH' (GPU hours) and 'FPS' (frames per second) in Table 3, indicating the use of GPUs, but it does not specify any particular GPU model (e.g., NVIDIA A100, RTX 3090) or CPU model used for the experiments. |
| Software Dependencies | No | The paper mentions 'Adam' as the optimizer but does not provide specific version numbers for any software dependencies, such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Images are resized to 256×256 and Adam is used as the optimizer with a learning rate of 0.001. The model is trained for 400 epochs with a batch size of 16. The number of prototypes is set to 50. The teacher network is a pre-trained WideResNet50 and the student is the same as RD. We adopt the depth and normals as auxiliary modalities for the MVTec 3D-AD and Eyecandies datasets, respectively. |
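The hyperparameters quoted above can be collected into a single configuration object, which is a common first step when attempting a reproduction. The sketch below is illustrative only: the class and field names are our own (no official code has been released), and only the values are taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MMRDTrainConfig:
    """Training settings reported for MMRD (names are hypothetical)."""
    image_size: int = 256            # inputs resized to 256x256
    optimizer: str = "Adam"
    learning_rate: float = 1e-3
    epochs: int = 400
    batch_size: int = 16
    num_prototypes: int = 50
    teacher_backbone: str = "WideResNet50"  # pre-trained teacher
    student_arch: str = "RD"                # student follows RD


cfg = MMRDTrainConfig()
print(cfg.epochs, cfg.batch_size, cfg.learning_rate)
```

Because the paper does not specify software versions or hardware, any such configuration still leaves those reproduction gaps open, as the table notes.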