Explainable Deep One-Class Classification
Authors: Philipp Liznerski, Lukas Ruff, Robert A. Vandermeulen, Billy Joe Franks, Marius Kloft, Klaus-Robert Müller
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper we present an explainable deep one-class classification method, Fully Convolutional Data Description (FCDD), where the mapped samples are themselves also an explanation heatmap. FCDD yields competitive detection performance and provides reasonable explanations on common anomaly detection benchmarks with CIFAR-10 and ImageNet. On MVTec-AD, a recent manufacturing dataset offering ground-truth anomaly maps, FCDD sets a new state of the art in the unsupervised setting. Our method can incorporate ground-truth anomaly explanations during training and using even a few of these (∼5) improves performance significantly. (A sketch of the FCDD objective appears after this table.) |
| Researcher Affiliation | Collaboration | (1) ML Group, Technical University of Kaiserslautern, Germany; (2) ML Group, Technical University of Berlin, Germany; (3) Google Research, Brain Team, Berlin, Germany; (4) Department of Artificial Intelligence, Korea University, Seoul, Republic of Korea; (5) Max Planck Institute for Informatics, Saarbrücken, Germany |
| Pseudocode | Yes | Algorithm 1: Receptive Field Upsampling (a hedged code sketch of this upsampling appears after this table) |
| Open Source Code | Yes | Our code is available at: https://github.com/liznerski/fcdd |
| Open Datasets | Yes | On MVTec-AD, a recent manufacturing dataset offering ground-truth anomaly maps, FCDD sets a new state of the art in the unsupervised setting... We first evaluate FCDD on the Fashion-MNIST, CIFAR-10, and ImageNet datasets. These datasets are cited as: "CIFAR-10 (Krizhevsky et al., 2009)", "ImageNet1k (Deng et al., 2009)", "Fashion-MNIST (Xiao et al., 2017)", "MVTec AD (Bergmann et al., 2019)". |
| Dataset Splits | Yes | The common AD benchmark is to utilize these classification datasets in a one-vs-rest setup where the one class is used as the nominal class and the rest of the classes are used as anomalies at test time. For training, we only use nominal samples as well as random samples from some auxiliary Outlier Exposure (OE) (Hendrycks et al., 2019a) dataset, which is separate from the ground-truth anomaly classes following Hendrycks et al. (2019a;b). (A code sketch of this split appears after this table.) |
| Hardware Specification | No | The paper discusses training and optimization details (Appendix E) and general computational performance, but it does not specify any particular hardware components such as GPU models, CPU types, or memory specifications used for running experiments. |
| Software Dependencies | Yes | We optimize the network parameters using SGD (Bottou, 2010) with Nesterov momentum (µ = 0.9) (Sutskever et al., 2013) ... We optimize the network using Adam (Kingma and Ba, 2015) (β = (0.9, 0.999))... The reference to PyTorch documentation (pytorch.org/docs/1.4.0/torchvision/transforms.html) implies PyTorch version 1.4.0. |
| Experiment Setup | Yes | We train for 400 epochs using a batch size of 128 samples. We optimize the network parameters using SGD with Nesterov momentum (µ = 0.9), weight decay of 10^-6, and an initial learning rate of 0.01, which decreases the previous learning rate per epoch by a factor of 0.98. The pre-processing pipeline is: (1) random crop to size 28 after zero-padding of 2 pixels on all sides, (2) random horizontal flipping with a chance of 50%, (3) data normalization. (A PyTorch sketch of this setup appears after this table.) |
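
The abstract quoted under Research Type describes the FCDD objective at a high level: the network's output map is passed through a pseudo-Huber transform, and its spatial mean serves as the anomaly score. Below is a minimal PyTorch sketch of that objective as stated in the paper; the function name `fcdd_loss`, the tensor shapes, and the numerical clamp are assumptions for illustration, not the authors' implementation.

```python
import torch

def fcdd_loss(phi_out: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Sketch of the FCDD objective.

    phi_out: (n, 1, u, v) output map of the fully convolutional network.
    labels:  (n,) with 0 = nominal, 1 = anomalous (e.g. OE samples).
    """
    # Pseudo-Huber transform A(X) = sqrt(phi(X)^2 + 1) - 1, per output pixel.
    a = torch.sqrt(phi_out ** 2 + 1) - 1
    # Spatial mean (1/(u*v)) * sum over A(X): the per-sample anomaly score.
    score = a.flatten(1).mean(dim=1)
    y = labels.float()
    # Nominal samples minimize the score; anomalous samples maximize it
    # via -log(1 - exp(-score)). The clamp guards against log(0).
    anom_term = -torch.log((-torch.expm1(-score)).clamp_min(1e-12))
    return ((1 - y) * score + y * anom_term).mean()
```

Note that the untransformed output map `a` is exactly the explanation heatmap the abstract refers to, which is why no separate attribution step is needed.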
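The paper's Algorithm 1 (Receptive Field Upsampling) maps the low-resolution output heatmap back to input resolution with a fixed Gaussian kernel, which the paper notes is equivalent to a strided transposed convolution. A hedged sketch of that equivalence follows; `rf_size`, `rf_stride`, and `std` stand in for the network's receptive-field parameters and are assumptions here, and cropping the result to the exact input size is omitted.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(size: int, std: float) -> torch.Tensor:
    """2D Gaussian kernel of shape (size, size), normalized to sum to 1."""
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-(coords ** 2) / (2 * std ** 2))
    kernel = torch.outer(g, g)
    return kernel / kernel.sum()

def upsample_heatmap(low_res: torch.Tensor, rf_size: int,
                     rf_stride: int, std: float) -> torch.Tensor:
    """Upsample an (n, 1, u, v) anomaly map toward input resolution by
    placing a Gaussian on every receptive-field center."""
    kernel = gaussian_kernel(rf_size, std).view(1, 1, rf_size, rf_size)
    return F.conv_transpose2d(low_res, kernel, stride=rf_stride)
```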
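The one-vs-rest protocol quoted under Dataset Splits can be reproduced roughly as follows; the choice of class 0 as the nominal class and the use of torchvision's CIFAR-10 loader are illustrative assumptions, and the OE samples would come from a separate auxiliary dataset following Hendrycks et al. (2019a).

```python
from torch.utils.data import Subset
from torchvision.datasets import CIFAR10

NOMINAL = 0  # hypothetical: class 0 ("airplane") as the nominal class

train = CIFAR10("data", train=True, download=True)
test = CIFAR10("data", train=False, download=True)

# Training uses nominal samples only; OE samples are drawn from a
# separate auxiliary dataset, disjoint from the ground-truth anomalies.
train_nominal = Subset(
    train, [i for i, y in enumerate(train.targets) if y == NOMINAL]
)

# At test time, the remaining nine classes serve as anomalies.
test_labels = [0 if y == NOMINAL else 1 for y in test.targets]
```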
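Finally, the training configuration quoted under Experiment Setup maps directly onto standard PyTorch components. A minimal sketch, assuming placeholder normalization statistics and a stand-in model (neither is given in this excerpt):

```python
import torch
import torchvision.transforms as T

# Preprocessing as quoted: (1) random 28x28 crop after 2-px zero-padding,
# (2) 50% horizontal flip, (3) normalization.
transform = T.Compose([
    T.RandomCrop(28, padding=2),
    T.RandomHorizontalFlip(p=0.5),
    T.ToTensor(),
    T.Normalize(mean=[0.5], std=[0.5]),  # placeholder statistics
])

model = torch.nn.Conv2d(1, 1, kernel_size=3)  # stand-in for the FCDD network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9,
                            nesterov=True, weight_decay=1e-6)
# Multiply the learning rate by 0.98 after every epoch, as quoted above.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.98)

for epoch in range(400):  # 400 epochs; the DataLoader would use batch_size=128
    # one pass over the training loader goes here
    scheduler.step()
```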