Robust anomaly detection and backdoor attack detection via differential privacy

Authors: Min Du, Ruoxi Jia, Dawn Song

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We first present a theoretical analysis on how differential privacy helps with the detection, and then conduct extensive experiments to validate the effectiveness of differential privacy in improving outlier detection, novelty detection, and backdoor attack detection.
Researcher Affiliation | Academia | Min Du, Ruoxi Jia, Dawn Song. University of California, Berkeley. {min.du,ruoxijia,dawnsong}@berkeley.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We utilize the MNIST dataset, composed of handwritten digits 0-9, and the notMNIST dataset (Kaggle [2017]), which contains letters A-J in different fonts. ...The Hadoop file system (HDFS) log dataset (Wei Xu [2009]) is generated by running Hadoop map-reduce jobs for 48 hours on 203 Amazon EC2 nodes.
Dataset Splits | Yes | The original MNIST data contain 60,000 training images and 10,000 test images, which we refer to as MNIST-train and MNIST-test respectively. ...Each training dataset is constructed with a particular outlier ratio ro, such that the resulting dataset MNIST-OD-train(ro) contains 60,000 images in total, where a fraction 1-ro are from MNIST-train and ro are from notMNIST-train. ...As in DeepLog, our training dataset contains 4,855 normal block sessions, while the test dataset includes 553,366 normal sessions and 16,838 abnormal sessions.
Hardware Specification | Yes | In our experience utilizing NVIDIA Tesla V100 SXM2 GPU cards, the training time for each epoch could be up to 80 times longer.
Software Dependencies | No | The paper mentions software such as scikit-learn and refers to prior work for differential privacy (Abadi et al. [2016]), but it does not specify exact version numbers for these dependencies, which limits reproducibility.
Experiment Setup | Yes | For autoencoders, the encoder network contains 3 convolutional layers with max pooling, while the decoder network contains 3 corresponding upsampling layers. For differential privacy, we use a clipping bound C = 1 and δ = 10^-5, and vary the noise scale σ as in (Abadi et al. [2016]). All models are trained with a learning rate of 0.15, a mini-batch size of 200 and for a total of 60 epochs. ...The model-related parameters are: 2 layers, 256 units per layer, 10 time steps, and a batch size of 256.
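The differentially private training described in the Experiment Setup row (clipping bound C = 1, noise scale σ, per Abadi et al. [2016]) corresponds to the per-example gradient processing of DP-SGD. The following is a minimal NumPy sketch of that aggregation step, not the authors' code; the function name, gradient shapes, and example values are illustrative assumptions.

```python
import numpy as np

def dp_sgd_step(per_example_grads, C=1.0, sigma=1.1, rng=None):
    """One DP-SGD gradient aggregation step (Abadi et al., 2016).

    per_example_grads: array of shape (batch, dim), one flattened
    gradient per training example.
    C: L2 clipping bound; the Gaussian noise stddev is sigma * C.
    """
    rng = np.random.default_rng() if rng is None else rng
    # 1. Clip each example's gradient to L2 norm at most C.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / C)
    # 2. Sum the clipped gradients and add Gaussian noise of scale sigma * C.
    noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, sigma * C,
                                                 size=clipped.shape[1])
    # 3. Average over the mini-batch (size 200 in the paper's setup).
    return noisy_sum / per_example_grads.shape[0]

grads = np.array([[3.0, 4.0], [0.1, 0.2]])  # two toy example gradients
update = dp_sgd_step(grads, C=1.0, sigma=1.1)
# The first row (norm 5) is scaled down to norm 1 before noise is added;
# the second (norm < 1) passes through unchanged.
```

Varying σ trades off privacy (and, per the paper, detection robustness) against utility; C = 1 and δ = 10^-5 match the values quoted above.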
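The MNIST-OD-train(ro) construction quoted under Dataset Splits (a fixed-size training set in which a fraction ro of images are notMNIST outliers) can be sketched as below. The helper name, seed, and the synthetic stand-in arrays are hypothetical, not from the paper.

```python
import numpy as np

def build_outlier_dataset(inliers, outliers, ro, total=60000, seed=0):
    """Mix inlier and outlier pools into one training set of `total`
    examples, a fraction `ro` of which are outliers (MNIST-OD-train(ro))."""
    rng = np.random.default_rng(seed)
    n_out = int(round(ro * total))
    n_in = total - n_out
    # Sample without replacement from each pool.
    idx_in = rng.choice(len(inliers), size=n_in, replace=False)
    idx_out = rng.choice(len(outliers), size=n_out, replace=False)
    mixed = np.concatenate([inliers[idx_in], outliers[idx_out]])
    labels = np.concatenate([np.zeros(n_in), np.ones(n_out)])  # 1 = outlier
    # Shuffle so outliers are interleaved with inliers.
    perm = rng.permutation(total)
    return mixed[perm], labels[perm]

# Tiny stand-ins for MNIST-train / notMNIST-train (flattened 28x28 images).
mnist = np.zeros((1000, 784))
notmnist = np.ones((1000, 784))
data, labels = build_outlier_dataset(mnist, notmnist, ro=0.1, total=500)
# labels.mean() == 0.1: 10% of the 500 training images are outliers.
```

In the paper's setting, `inliers` would be the 60,000 MNIST-train images and `outliers` the notMNIST-train images, with `total=60000` and ro swept over the outlier ratios under study.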