Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection

Authors: Bo Zong, Qi Song, Martin Renqiang Min, Wei Cheng, Cristian Lumezanu, Daeki Cho, Haifeng Chen

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on several public benchmark datasets show that DAGMM significantly outperforms state-of-the-art anomaly detection techniques and achieves up to 14% improvement in standard F1 score.
Researcher Affiliation | Collaboration | NEC Laboratories America and Washington State University, Pullman. Contact: {bzong, renqiang, weicheng, lume, dkcho, haifeng}@nec-labs.com; qsong@eecs.wsu.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology.
Open Datasets | Yes | The paper employs four benchmark datasets: KDDCUP, Thyroid, Arrhythmia, and KDDCUP-Rev. KDDCUP is the KDDCUP99 10 percent dataset from the UCI repository (Lichman, 2013); the Thyroid and Arrhythmia datasets (Lichman, 2013) are obtained from the ODDS repository (http://odds.cs.stonybrook.edu/). See the loading sketch after the table.
Dataset Splits | No | The paper states that it takes '50% of data by random sampling for training with the rest 50% reserved for testing', but does not specify a separate validation split. See the split sketch after the table.
Hardware Specification | No | The paper does not report the hardware used in its experiments (no GPU/CPU models, clock speeds, or memory amounts).
Software Dependencies | No | The paper mentions TensorFlow (Abadi et al., 2016) and the Adam algorithm (Kingma & Ba, 2015) but does not specify version numbers for either.
Experiment Setup | Yes | The network structures of DAGMM used on individual datasets are summarized as follows. KDDCUP. [...] FC(120, 60, tanh)-FC(60, 30, tanh)-FC(30, 10, tanh)-FC(10, 1, none)-FC(1, 10, tanh)-FC(10, 30, tanh)-FC(30, 60, tanh)-FC(60, 120, none), and the estimation network performs with FC(3, 10, tanh)-Drop(0.5)-FC(10, 4, softmax). [...] trained by the Adam algorithm (Kingma & Ba, 2015) with learning rate 0.0001. For KDDCUP, Thyroid, Arrhythmia, and KDDCUP-Rev, the numbers of training epochs are 200, 20000, 10000, and 400, respectively, and the mini-batch sizes are 1024, 1024, 128, and 1024, respectively. In all DAGMM instances, λ1 is set to 0.1 and λ2 to 0.005. See the network sketch after the table.
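
For concreteness, here is a minimal loading sketch for the KDDCUP99 10 percent dataset. The paper points to the UCI repository; using scikit-learn's fetch_kddcup99 mirror is an assumption for convenience, and the categorical column indices below are specific to that mirror. The paper reports one-hot encoding the categorical features to reach 120 input dimensions.

```python
# Sketch (assumptions: scikit-learn >= 1.2 and its KDDCUP99 mirror;
# the paper itself cites the UCI repository, Lichman (2013)).
import numpy as np
from sklearn.datasets import fetch_kddcup99
from sklearn.preprocessing import OneHotEncoder

data = fetch_kddcup99(percent10=True)   # the "10 percent" subset
X_raw, y = data.data, data.target       # ~494k records, 41 features

# One-hot encode the string-valued columns in this mirror
# (protocol_type, service, flag); per the paper, one-hot encoding of
# the categorical features yields 120 input dimensions overall.
cat_idx = [1, 2, 3]
num_idx = [i for i in range(X_raw.shape[1]) if i not in cat_idx]
onehot = OneHotEncoder(sparse_output=False).fit_transform(X_raw[:, cat_idx])
X = np.hstack([X_raw[:, num_idx].astype(np.float64), onehot])
```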
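The 50/50 split could look like the sketch below. train_test_split and the fixed random seed are assumptions; the paper only says the split is random and names no validation set. The paper also states that only normal-class samples are used for training, and for KDDCUP it treats the minority 'normal' label as the anomaly class, so the mask below is illustrative.

```python
# Sketch of the paper's split: 50% random sample for training, the
# remaining 50% for testing, no separate validation split.
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=42)  # fixed seed is an assumption

# Train on normal-class data only; for KDDCUP the paper treats the
# minority b'normal.' records as anomalies, so the attack records
# form the training (normal) class here.
X_train = X_train[y_train != b"normal."]
```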
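As a cross-check of the quoted KDDCUP configuration, below is a minimal tf.keras sketch of the stated layer structures and optimizer settings. This is not a DAGMM implementation: the joint energy objective that λ1 = 0.1 and λ2 = 0.005 weight is omitted, and tf.keras itself is an assumption (the paper names TensorFlow without a version).

```python
import tensorflow as tf

# Compression network (autoencoder) for KDDCUP, mirroring the encoder
# FC(120,60,tanh)-FC(60,30,tanh)-FC(30,10,tanh)-FC(10,1,none) and the
# symmetric decoder FC(1,10,tanh)-...-FC(60,120,none) quoted above.
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(120,)),
    tf.keras.layers.Dense(60, activation="tanh"),
    tf.keras.layers.Dense(30, activation="tanh"),
    tf.keras.layers.Dense(10, activation="tanh"),
    tf.keras.layers.Dense(1, activation=None),
])
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(10, activation="tanh"),
    tf.keras.layers.Dense(30, activation="tanh"),
    tf.keras.layers.Dense(60, activation="tanh"),
    tf.keras.layers.Dense(120, activation=None),
])

# Estimation network FC(3,10,tanh)-Drop(0.5)-FC(10,4,softmax). Per the
# paper, its 3-dim input is the 1-dim latent code plus two
# reconstruction-error features, and the 4 softmax outputs are GMM
# mixture memberships.
estimation = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(10, activation="tanh"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(4, activation="softmax"),
])

# Optimizer settings quoted above: Adam with learning rate 0.0001;
# for KDDCUP, 200 epochs with mini-batches of 1024.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
```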