SANFlow: Semantic-Aware Normalizing Flow for Anomaly Detection
Authors: Daehyun Kim, Sungyong Baik, Tae Hyun Kim
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results highlight the efficacy of the proposed framework in improving the density modeling and thus anomaly detection performance. In this section, we evaluate our framework to validate its capability in both pixel-level anomaly localization and image-level anomaly detection. We conduct ablation experiments to evaluate the efficacy of each component of our framework. |
| Researcher Affiliation | Academia | Daehyun Kim1 Sungyong Baik2 Tae Hyun Kim3 Dept. of Artificial Intelligence1, Dept. of Data Science2, Dept. of Computer Science3 Hanyang University {daehyun, dsybaik, taehyunkim}@hanyang.ac.kr |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | We will release our code and data upon acceptance, and more details and results can be found in the supplementary material. |
| Open Datasets | Yes | The experiments are conducted on two commonly used datasets for unsupervised anomaly detection: STC (Shanghai Tech Campus) dataset [28] and MVTec dataset [4]. |
| Dataset Splits | No | MVTec is a dataset that consists of images of industrial products categorized into 5 texture categories and 10 object categories. To evaluate the unsupervised anomaly detection performance, the training set includes only defect-free (e.g., normal) images: 3,629 normal images are available for training while 1,725 normal and abnormal images are used as test set. Among the test images, 1,258 images contain defects (e.g., abnormal images). For data augmentation, abnormal patches undergo random vertical, horizontal flip and rotation during training, as described in 3.2. We evaluate and compare algorithms in terms of area under the receiver operating characteristic curve (AUROC) and area under the per-region-overlap curve (AUPRO), as used in [4, 20]. AUPRO scores can be found in the supplementary. STC is a video surveillance dataset, which provides static videos of 13 different scenes of 856 × 480 resolution. It contains 274,515 frames for training and 42,883 frames for evaluation. The training set consists of only normal sequences, while the evaluation set consists of 300,308 regular frames and 42,883 irregular frames. The paper describes training and evaluation sets but does not explicitly define a validation split. |
| Hardware Specification | Yes | At last, we use NVIDIA RTX8000 for training and the model shows near real-time performance by running at 13fps. |
| Software Dependencies | No | Adam optimizer with a learning rate of 5e-4 and 80 train epochs for training. The paper mentions software components but gives no specific version numbers. |
| Experiment Setup | Yes | Similar to CFLOW-AD [20], we use K=3 scales for feature pyramid; normalizing flow consists of L=8 transformation blocks; the dimension of position embedding vector D is 512; Adam optimizer with a learning rate of 5e-4 and 80 train epochs for training. The loss weight hyperparameters λ1 and λ2 are set to be 1.0 and 0.2, respectively. |
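
For anyone attempting a reproduction, the hyperparameters quoted in the Experiment Setup row can be collected in one place. The sketch below is only an illustration: the class name `SANFlowSetup` and all field names are hypothetical and do not come from the authors' code, which was unreleased at the time of this summary. Only the values (K=3, L=8, D=512, learning rate 5e-4, 80 epochs, λ1=1.0, λ2=0.2) are taken from the paper.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SANFlowSetup:
    """Hypothetical container for the hyperparameters quoted from the paper."""

    num_scales: int = 3          # K: scales of the feature pyramid
    num_flow_blocks: int = 8     # L: transformation blocks in the normalizing flow
    pos_embed_dim: int = 512     # D: dimension of the position embedding vector
    optimizer: str = "adam"      # the paper names Adam; no library version is given
    learning_rate: float = 5e-4
    train_epochs: int = 80
    lambda1: float = 1.0         # loss weight λ1
    lambda2: float = 0.2         # loss weight λ2

    def __post_init__(self) -> None:
        # Basic sanity checks so an invalid override fails early.
        counts = (self.num_scales, self.num_flow_blocks,
                  self.pos_embed_dim, self.train_epochs)
        if min(counts) < 1:
            raise ValueError("K, L, D and the epoch count must be positive")
        if self.learning_rate <= 0:
            raise ValueError("learning rate must be positive")


setup = SANFlowSetup()
print(asdict(setup))
```

Freezing the dataclass keeps the reported values from being changed by accident during a run; an ablation can build a variant explicitly, for example `SANFlowSetup(num_flow_blocks=4)`.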