Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
Authors: Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, Zico Kolter
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed classifier on several benchmark datasets common to the data poisoning literature. On the Dogfish binary classification challenge from ImageNet, our classifier maintains 81.3% certified accuracy in the face of an adversary who could reduce an undefended classifier to less than 1%. Additional experiments on MNIST and CIFAR10 demonstrate our algorithm's effectiveness for multi-class classification. |
| Researcher Affiliation | Collaboration | 1Carnegie Mellon University 2Bosch Center for AI. Correspondence to: Elan Rosenfeld <elan@cmu.edu>. |
| Pseudocode | Yes | Algorithm 1 Randomized smoothing for label-flipping robustness |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | Following Koh & Liang (2017) and Steinhardt et al. (2017), we perform experiments on MNIST 1/7, the IMDB review sentiment dataset (Maas et al., 2011), and the Dogfish binary classification challenge taken from ImageNet. We run additional experiments on multi-class MNIST and CIFAR10. |
| Dataset Splits | No | The paper specifies training and test set sizes (e.g., '13,007 training points and 2,163 test points' for MNIST 1/7, '25,000 training examples and 25,000 test examples' for IMDB, '900 training points and 300 test points' for Dogfish). However, it does not provide any information regarding validation set splits, percentages, or usage. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. It mentions 'minimal additional runtime complexity' and 'embarrassingly parallel' but no concrete hardware. |
| Software Dependencies | No | The paper mentions using a 'high-precision arithmetic library (Johansson et al., 2013)' but does not provide a version number for it or any other software dependencies. |
| Experiment Setup | No | The paper mentions general parameters like 'noise parameter q' and 'regularization parameter λ' and discusses their effect. It also notes 'smaller levels of noise achieved higher certified test accuracy'. However, it does not provide concrete numerical values for hyperparameters such as learning rate, batch size, number of epochs, or specific optimizer settings used in the experiments. It states that λ was set via a formula, but does not report the final values used. |
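The core idea behind the paper's defense can be illustrated with a minimal Monte Carlo sketch: train many base classifiers on copies of the training set whose binary labels are independently flipped with probability q, then take a majority vote at test time. This is an assumption-laden illustration, not the paper's Algorithm 1 (which derives a closed-form certified classifier rather than sampling); the `nearest_centroid` base learner is a hypothetical stand-in.

```python
import numpy as np

def nearest_centroid(X, y):
    """Toy base learner: classify by distance to per-class mean (illustrative only)."""
    c0 = X[y == 0].mean(axis=0)
    c1 = X[y == 1].mean(axis=0)
    return lambda x: int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))

def smoothed_predict(train_fn, X, y, x_test, q=0.1, n_samples=100, rng=None):
    """Monte Carlo sketch of randomized smoothing over label noise:
    each binary label is flipped independently with probability q,
    a base classifier is trained on the noised labels, and the final
    prediction is the majority vote across samples."""
    rng = np.random.default_rng(rng)
    votes = np.zeros(2, dtype=int)
    for _ in range(n_samples):
        flips = rng.random(len(y)) < q            # which labels to flip
        y_noisy = np.where(flips, 1 - y, y)       # flip 0 <-> 1 where selected
        clf = train_fn(X, y_noisy)
        votes[clf(x_test)] += 1
    # The vote margin is what a certificate would bound: a large enough
    # margin implies the vote cannot change under a limited number of flips.
    return int(votes.argmax()), votes / n_samples
```

In this framing, label-flipping robustness follows because an adversary who flips a few training labels only perturbs the distribution each base classifier is trained on slightly, so a sufficiently large vote margin is provably stable; the paper obtains this guarantee exactly rather than by sampling.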