Hierarchical Vector Quantized Transformer for Multi-class Unsupervised Anomaly Detection
Authors: Ruiying Lu, YuJie Wu, Long Tian, Dongsheng Wang, Bo Chen, Xiyang Liu, Ruimin Hu
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By evaluating on MVTec-AD and VisA datasets, our model surpasses the state-of-the-art alternatives and possesses good interpretability. The code is available at https://github.com/RuiyingLu/HVQ-Trans. |
| Researcher Affiliation | Academia | Ruiying Lu1, YuJie Wu2, Long Tian2*, Dongsheng Wang3, Bo Chen3, Xiyang Liu2, Ruimin Hu1. School of Cyber Engineering1, Software Engineering Institute2, National Key Laboratory of Radar Signal Processing3, Xidian University. {luruiying,tianlong}@xidian.edu.cn |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/RuiyingLu/HVQ-Trans. |
| Open Datasets | Yes | MVTec-AD [2] is a widely-used industrial anomaly detection dataset with 15 classes... VisA [45] is a recently published large dataset... CIFAR-10 [45] is a classical image classification dataset of 10 categories. |
| Dataset Splits | Yes | For each class, the training samples are normal while the test samples can be either normal or anomalous. In order to implement many-versus-many anomaly detection, we select 5 normal classes while the rest classes are viewed as anomalies. |
| Hardware Specification | Yes | Our model is trained for 1000 epochs on 2 GPUs (NVIDIA GeForce RTX 3080 10GB) with batch size 16. |
| Software Dependencies | No | The paper mentions software like EfficientNet and AdamW but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | The input image size of MVTec-AD is 224×224×3... The feature maps become 14×14×272, namely, the patch size is 16. Then we reduce the channel dimension of each patch into 256, followed by feeding them into a 4-layer vanilla Trans-enc and the corresponding 4-layer VQ-Trans-dec. We use AdamW [53] with weight decay 0.0001 for optimization. Our model is trained for 1000 epochs... with batch size 16. The learning rate is initialized as 1×10⁻⁴ and dropped by 0.1 after 800 epochs. |
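
The many-versus-many protocol quoted in the Dataset Splits row (train only on a set of normal classes, treat images from the remaining classes as anomalies at test time) can be sketched as follows for CIFAR-10. This is a minimal illustration, not the authors' data pipeline; in particular, the choice of which 5 classes form the normal set is an assumption made here for demonstration.

```python
# Sketch of a many-versus-many split on CIFAR-10: training data is drawn only
# from the normal classes, and test images from the other classes are labeled
# anomalous. The particular normal set [0..4] is an illustrative choice.
import numpy as np
from torchvision.datasets import CIFAR10

NORMAL_CLASSES = [0, 1, 2, 3, 4]  # illustrative 5 normal classes

train = CIFAR10("./data", train=True, download=True)
test = CIFAR10("./data", train=False, download=True)

train_targets = np.array(train.targets)
# Normal-only training set, as required for unsupervised anomaly detection.
train_indices = np.where(np.isin(train_targets, NORMAL_CLASSES))[0]

test_targets = np.array(test.targets)
# Binary evaluation labels: 0 = normal class, 1 = anomalous class.
anomaly_labels = (~np.isin(test_targets, NORMAL_CLASSES)).astype(int)
```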
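The Experiment Setup row pins down concrete hyperparameters. Below is a minimal PyTorch sketch of that training configuration; the stand-in model, backbone, and data loading are hypothetical placeholders, while the feature dimensions, optimizer, weight decay, learning-rate schedule, epoch count, and batch size follow the values quoted from the paper.

```python
# Sketch of the reported training configuration (not the authors' released code).
# Only the numeric constants come from the paper; the model is a placeholder.
import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import MultiStepLR

IMAGE_SHAPE = (3, 224, 224)    # MVTec-AD input size 224 x 224 x 3
FEATURE_SHAPE = (272, 14, 14)  # backbone feature map; patch size 16
EMBED_DIM = 256                # per-patch channels after reduction
NUM_LAYERS = 4                 # 4-layer Trans-enc and 4-layer VQ-Trans-dec
BATCH_SIZE = 16
NUM_EPOCHS = 1000

# Hypothetical stand-in for the hierarchical VQ transformer.
model = nn.Sequential(
    nn.Conv2d(FEATURE_SHAPE[0], EMBED_DIM, kernel_size=1),  # channel reduction 272 -> 256
    nn.Flatten(start_dim=2),                                # 14 x 14 patches -> 196 tokens
)

# AdamW with weight decay 1e-4; lr starts at 1e-4 and drops by 0.1 after 800 epochs.
optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=1e-4)
scheduler = MultiStepLR(optimizer, milestones=[800], gamma=0.1)

for epoch in range(NUM_EPOCHS):
    # ... one pass over the anomaly-free training images would go here ...
    scheduler.step()
```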