Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Certified Adversarial Robustness with Additive Noise
Authors: Bai Li, Changyou Chen, Wenlin Wang, Lawrence Carin
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our evaluation on MNIST, CIFAR-10 and ImageNet suggests that the proposed method is scalable to complicated models and large data sets, while providing competitive robustness to state-of-the-art provable defense methods. We conduct a comprehensive set of experiments to evaluate both the theoretical and empirical performance of our methods, with results that are competitive with the state of the art. |
| Researcher Affiliation | Academia | Bai Li, Department of Statistical Science, Duke University (EMAIL); Changyou Chen, Department of CSE, University at Buffalo, SUNY (EMAIL); Wenlin Wang, Department of ECE, Duke University (EMAIL); Lawrence Carin, Department of ECE, Duke University (EMAIL) |
| Pseudocode | Yes | Algorithm 1 Certified Robust Classifier |
| Open Source Code | Yes | The source code can be found at https://github.com/Bai-Li/STN-Code. |
| Open Datasets | Yes | Our evaluation on MNIST, CIFAR-10 and ImageNet suggests that the proposed method is scalable to complicated models and large data sets, while providing competitive robustness to state-of-the-art provable defense methods. We perform experiments on the MNIST and CIFAR-10 data sets, to evaluate the theoretical and empirical performance of our methods. We subsequently also consider the larger ImageNet dataset. |
| Dataset Splits | No | The paper mentions using a "test set" but does not specify explicit train/validation/test splits by percentages or sample counts, nor does it cite a predefined split with specific details for reproducibility beyond implicitly using standard dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running the experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For the MNIST data set, the model architecture follows the models used in [36], which contains two convolutional layers, each containing 64 filters, followed with a fully connected layer of size 128. For the CIFAR-10 dataset, we use a convolutional neural network with seven convolutional layers along with Max Pooling. In both datasets, image intensities are scaled to [0, 1], and the size of attacks are also rescaled accordingly. In all our subsequent experiments, we use the end points (lower for p(1) and upper for p(2)) of the 95% confidence intervals for estimating p(1) and p(2), and multiply 95% for the corresponding accuracy. In practice, we find a sample size of n = 100 is sufficient. In previous experiments, we use σ = 0.7 and σ = 100/255 for MNIST and CIFAR-10, respectively. |
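The certification procedure quoted in the Experiment Setup row, sampling n = 100 noisy copies of an input and using the end points of 95% confidence intervals for the top-two class probabilities p(1) and p(2), can be sketched as below. This is a minimal illustrative sketch, not the paper's exact Algorithm 1: the classifier `f`, the `certify` function name, and the normal-approximation confidence bounds are all assumptions made for clarity.

```python
import numpy as np

def certify(f, x, sigma=0.7, n=100, num_classes=10, z=1.96):
    """Estimate the smoothed classifier's top-two class probabilities
    under additive Gaussian noise (illustrative sketch).

    f           : function mapping a batch of inputs (n, d) to integer labels (n,)
    x           : flat input vector of shape (d,)
    sigma       : noise scale (the paper uses 0.7 for MNIST, 100/255 for CIFAR-10)
    n           : number of Monte-Carlo samples (the paper finds n = 100 sufficient)
    z           : normal quantile for a two-sided 95% interval
    """
    # Draw n noisy copies of x and classify each one.
    noisy = x[None, :] + sigma * np.random.randn(n, x.size)
    labels = f(noisy)
    counts = np.bincount(labels, minlength=num_classes)

    # Top-two classes by empirical frequency.
    order = np.argsort(counts)[::-1]
    p1_hat = counts[order[0]] / n
    p2_hat = counts[order[1]] / n

    # Conservative end points: lower bound for p(1), upper bound for p(2),
    # mirroring the paper's use of 95% confidence-interval end points.
    p1_lo = p1_hat - z * np.sqrt(p1_hat * (1 - p1_hat) / n)
    p2_hi = p2_hat + z * np.sqrt(p2_hat * (1 - p2_hat) / n)
    return order[0], p1_lo, p2_hi
```

The gap between the lower bound on p(1) and the upper bound on p(2) is what drives the size of the certified radius; a wider gap certifies robustness to larger perturbations.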