Robust anomaly detection and backdoor attack detection via differential privacy
Authors: Min Du, Ruoxi Jia, Dawn Song
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first present a theoretical analysis on how differential privacy helps with the detection, and then conduct extensive experiments to validate the effectiveness of differential privacy in improving outlier detection, novelty detection, and backdoor attack detection. |
| Researcher Affiliation | Academia | Min Du, Ruoxi Jia, Dawn Song; University of California, Berkeley; {min.du,ruoxijia,dawnsong}@berkeley.edu |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We utilize the MNIST dataset composed of handwritten digits 0-9, and the notMNIST dataset (Kaggle [2017]), which contains letters A-J with different fonts. ...The Hadoop file system (HDFS) log dataset (Wei Xu [2009]) is generated through running Hadoop map-reduce jobs for 48 hours on 203 Amazon EC2 nodes. |
| Dataset Splits | Yes | The original MNIST data contain 60,000 training images and 10,000 test images, which we refer to as MNIST-train and MNIST-test respectively. ...each training dataset is constructed with a particular outlier ratio r_o, such that the resulting dataset MNIST-OD-train(r_o) contains 60,000 images in total, where a fraction 1 - r_o are from MNIST-train and r_o are from notMNIST-train (see the first sketch below the table). ...As in DeepLog, our training dataset contains 4,855 normal block sessions, while the test dataset includes 553,366 normal sessions and 16,838 abnormal sessions. |
| Hardware Specification | Yes | In our experience utilizing NVIDIA Tesla V100 SXM2 GPU cards, the training time for each epoch could be up to 80 times longer. |
| Software Dependencies | No | The paper mentions software like 'scikit-learn' and refers to prior work for differential privacy (Abadi et al. [2016]), but it does not specify exact version numbers for these software dependencies to ensure reproducibility. |
| Experiment Setup | Yes | For autoencoders, the encoder network contains 3 convolutional layers with max pooling, while the decoder network contains 3 corresponding upsampling layers. For differential privacy, we use a clipping bound C = 1 and δ = 10^-5, and vary the noise scale σ as in (Abadi et al. [2016]). All models are trained with a learning rate of 0.15, a mini-batch size of 200, and for a total of 60 epochs (see the second sketch below the table). ...The model-related parameters are: 2 layers, 256 units per layer, 10 time steps, and a batch size of 256. |
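
The Dataset Splits row describes mixing MNIST-train inliers with notMNIST-train outliers at a ratio r_o. The sketch below shows one way to build MNIST-OD-train(r_o); it is a minimal reconstruction under stated assumptions, not the authors' code. It assumes torchvision for MNIST and a locally extracted notMNIST copy, and the helper name `build_mnist_od_train` and path `NOTMNIST_DIR` are illustrative.

```python
import numpy as np
from torchvision import datasets

TOTAL = 60_000                      # size of MNIST-OD-train(r_o), per the paper
NOTMNIST_DIR = "./notMNIST_small"   # hypothetical local path (class folders A-J)

def build_mnist_od_train(r_o, seed=0):
    """Mix (1 - r_o) MNIST-train inliers with r_o notMNIST outliers."""
    rng = np.random.default_rng(seed)
    n_out = int(round(r_o * TOTAL))
    n_in = TOTAL - n_out

    # Inliers: handwritten digits 0-9 from MNIST-train.
    mnist = datasets.MNIST("./data", train=True, download=True)
    inliers = mnist.data.numpy()[rng.choice(len(mnist), n_in, replace=False)]

    # Outliers: notMNIST letters A-J, loaded as grayscale 28x28 images.
    notmnist = datasets.ImageFolder(NOTMNIST_DIR)
    idx = rng.choice(len(notmnist), n_out, replace=False)
    outliers = np.stack([np.array(notmnist[i][0].convert("L")) for i in idx])

    # Shuffle so outliers are interspersed; keep the mask only for evaluation.
    images = np.concatenate([inliers, outliers])
    is_outlier = np.concatenate([np.zeros(n_in, bool), np.ones(n_out, bool)])
    perm = rng.permutation(TOTAL)
    return images[perm], is_outlier[perm]
```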
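The Experiment Setup row pins down the DP hyperparameters (C = 1, δ = 10^-5, learning rate 0.15, mini-batch size 200, 60 epochs) but names no framework. Below is a hedged sketch of the convolutional autoencoder trained with DP-SGD in the style of Abadi et al. [2016], using PyTorch with Opacus as one plausible realization. The channel widths, the padding of inputs to 32x32 (so three 2x poolings invert cleanly), and the placeholder noise scale σ = 1.0 are assumptions; the paper varies σ.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from opacus import PrivacyEngine

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 3 convolutional layers, each followed by max pooling.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Decoder: 3 corresponding upsampling layers.
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2), nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=2), nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Pad 28x28 MNIST to 32x32 (an assumption) so 32 -> 16 -> 8 -> 4 pools cleanly.
tfm = transforms.Compose([transforms.Pad(2), transforms.ToTensor()])
train_loader = DataLoader(
    datasets.MNIST("./data", train=True, download=True, transform=tfm),
    batch_size=200, shuffle=True)           # mini-batch size 200, per the paper

model = ConvAutoencoder()
optimizer = torch.optim.SGD(model.parameters(), lr=0.15)  # lr from the paper

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,   # sigma; the paper varies this, 1.0 is a placeholder
    max_grad_norm=1.0,      # clipping bound C = 1, per the paper
)

loss_fn = nn.MSELoss()
for epoch in range(60):                     # 60 epochs, per the paper
    for x, _ in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), x)         # reconstruction error
        loss.backward()
        optimizer.step()

epsilon = privacy_engine.get_epsilon(delta=1e-5)  # delta = 10^-5, per the paper
```

The HDFS log experiment's LSTM (2 layers, 256 units per layer, 10 time steps, batch size 256) would follow the same DP-SGD pattern, with the recurrent model swapped in; Opacus ships a DP-compatible `DPLSTM` layer for this purpose.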