Robustness of Autoencoders for Anomaly Detection Under Adversarial Impact

Authors: Adam Goodge, Bryan Hooi, See Kiong Ng, Wee Siong Ng

IJCAI 2020

Reproducibility assessment (variable: result — LLM response):
Research Type: Experimental — "In experiments on real data, these techniques led to a median improvement in AUC score of 9% in the presence of adversarial attacks in the range of those tested and 8% in their absence." Section 5 (Experiments) states that "deep autoencoders are trained using only data of the normal class, using the reconstruction error as the anomaly score." The paper's tables (Tables 1-5) summarize the two datasets and report AUC, precision, recall and F1 for the original, randomly-attacked and FGSM-attacked test sets on the WADI and SWaT datasets, with and without defenses.
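The "FGSM-attacked test set" above refers to perturbing inputs with the fast gradient sign method. A minimal sketch of how such an attack could look against a reconstruction-error detector is below; the model, epsilon value, and the choice to step *against* the gradient (so that anomalies receive lower scores) are illustrative assumptions, not the paper's exact attack.

```python
import torch
import torch.nn as nn

def fgsm_attack(model, x, epsilon=0.05):
    """Perturb x by one signed-gradient step to *reduce* its
    reconstruction error, disguising anomalies from the detector."""
    x = x.clone().requires_grad_(True)
    loss = nn.functional.mse_loss(model(x), x.detach())
    loss.backward()
    # Step against the gradient to lower the anomaly score.
    return (x - epsilon * x.grad.sign()).detach()

# Toy autoencoder and inputs (shapes are arbitrary for illustration).
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 8))
x = torch.randn(3, 8)
x_adv = fgsm_attack(model, x)
```

Each feature moves by at most epsilon, so the perturbation stays small while the anomaly score drops.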
Researcher Affiliation: Academia — Adam Goodge (1,3), Bryan Hooi (1,2), See Kiong Ng (1,2) and Wee Siong Ng (3); (1) School of Computing, National University of Singapore; (2) Institute of Data Science, National University of Singapore; (3) Institute for Infocomm Research, A*STAR, Singapore. Contact: adam.goodge@u.nus.edu, {dcsbhk, seekiong}@nus.edu.sg, wsng@i2r.a-star.edu.sg
Pseudocode: No — The paper describes its methods using mathematical formulas and prose, but it does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code: No — The paper states that "publicly available codes were used" for DAGMM and MAD-GAN, but it does not provide any link or statement indicating that the source code for the proposed APAE method or the experiments is publicly available.
Open Datasets: Yes — "The Secure Water Treatment (SWaT) is a water treatment testbed resembling those used by Singapore's Public Utility Board. The Water Distribution (WADI) is an extension of this setup to include a network of distribution pipelines. More information about the systems and datasets can be found at their websites [iTrust Centre for Research in Cyber Security, 2019]." Table 1 summarizes the two datasets used in the experiments.
Dataset Splits: Yes — "Early stopping was used once the loss function, Huber loss, is sufficiently small for the validation set, which was 20% of total training data."
Hardware Specification: No — The paper does not provide any specific hardware details, such as GPU models, CPU types, or memory specifications, used for running the experiments.
Software Dependencies: No — The paper mentions the "PyTorch library" and the "Adam optimization scheme" but does not specify version numbers for these or any other software dependencies, which are necessary for full reproducibility.
Experiment Setup: Yes — Deep autoencoders are trained using only data of the normal class, with the reconstruction error used as the anomaly score. The bottleneck hidden layer had 100 (50) neurons for the WADI (SWaT) dataset, reflecting their difference in input dimensionality. The models are trained using the Adam optimization scheme with learning rate 1e-3 and (β1, β2) = (0.5, 0.99). Early stopping was used once the loss function, Huber loss, was sufficiently small on the validation set, which comprised 20% of the total training data. The paper uses N = 2000 iterations and α = 0.001 for the gradient descent step (Eq. (7)), and the value of k is varied within the range of 90 to 100 to explore its effect on performance. Readings from a 10-second sliding window (with a stride of one) were concatenated into one feature vector for each data sample.
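The setup quoted above can be sketched in PyTorch as follows. This is a minimal illustration assuming synthetic stand-in data and a single-hidden-layer autoencoder; the input dimensionality, number of epochs, and early-stopping patience are assumptions not taken from the paper, while the bottleneck size, optimizer settings, Huber loss, 20% validation split, and 10-step sliding window with stride one follow the quoted description.

```python
import numpy as np
import torch
import torch.nn as nn

np.random.seed(0)
torch.manual_seed(0)

def sliding_windows(series, window=10, stride=1):
    """Concatenate each 10-step window into one feature vector."""
    n, d = series.shape
    idx = range(0, n - window + 1, stride)
    return np.stack([series[i:i + window].reshape(-1) for i in idx])

# Stand-in for normal-class sensor readings (500 time steps, 12 sensors).
raw = np.random.randn(500, 12).astype(np.float32)
X = torch.from_numpy(sliding_windows(raw))  # shape: (491, 120)

# Hold out 20% of the training data as a validation set.
n_val = int(0.2 * len(X))
X_train, X_val = X[:-n_val], X[-n_val:]

in_dim, bottleneck = X.shape[1], 100  # 100 neurons for WADI (50 for SWaT)
model = nn.Sequential(
    nn.Linear(in_dim, bottleneck), nn.ReLU(),
    nn.Linear(bottleneck, in_dim),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.5, 0.99))
loss_fn = nn.HuberLoss()

best_val, patience, bad_epochs = float("inf"), 5, 0
for epoch in range(100):
    model.train()
    opt.zero_grad()
    loss = loss_fn(model(X_train), X_train)
    loss.backward()
    opt.step()

    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(X_val), X_val).item()
    if val_loss < best_val - 1e-5:
        best_val, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
    if bad_epochs >= patience:  # stop once validation loss plateaus
        break

# Anomaly score: per-sample reconstruction error.
with torch.no_grad():
    scores = ((model(X_val) - X_val) ** 2).mean(dim=1)
```

At test time, samples whose reconstruction error exceeds a chosen threshold would be flagged as anomalous.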