Improved Certified Defenses against Data Poisoning with (Deterministic) Finite Aggregation

Authors: Wenxiao Wang, Alexander J Levine, Soheil Feizi

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our proposed Finite Aggregation consistently improves certificates on MNIST, CIFAR-10, and GTSRB, boosting certified fractions by up to 3.05%, 3.87% and 4.77%, respectively, while keeping the same clean accuracies as DPAs, effectively establishing a new state of the art in (pointwise) certified robustness against data poisoning. (An illustrative sketch of the partition-and-aggregate scheme appears after this table.)
Researcher Affiliation | Academia | Department of Computer Science, University of Maryland, College Park, Maryland, USA. Correspondence to: Wenxiao Wang <wwx@umd.edu>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-formatted procedures).
Open Source Code | No | The paper does not provide concrete access to source code for the described methodology, nor does it include a repository link or an explicit code-release statement.
Open Datasets | Yes | We evaluate our method on MNIST (LeCun et al., 1998), CIFAR-10 (Krizhevsky, 2009) and GTSRB (Stallkamp et al., 2012) datasets, which are respectively 10-way classification of handwritten digits, 10-way object classification and 43-way classification of traffic signs.
Dataset Splits | No | The paper refers to a 'training set' and a 'testing set' but does not provide the split details needed for reproducibility (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) for training, validation, or test sets.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor speeds, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper mentions the Network-In-Network architecture and the 'hyperparameters from (Gidaris et al., 2018)', but it does not list version numbers for any software components or libraries required for reproducibility.
Experiment Setup | Yes | Training hyperparameters: 'We use the Network-In-Network (Lin et al., 2014) architecture, trained with the hyperparameters from (Gidaris et al., 2018). On MNIST and GTSRB, we also exclude horizontal flips in data augmentations as in (Levine & Feizi, 2021).' The paper also studies 'The effect of k: accuracy vs. robustness' and 'The effect of d: efficiency vs. robustness'. (A hedged sketch of the augmentation choice appears after this table.)
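
To make the partition-and-aggregate scheme referenced under Research Type concrete, below is a minimal, illustrative Python sketch of the DPA-style pipeline that Finite Aggregation refines with its spreading degree d. This is not the authors' released code: the (bytes, label) dataset format, the hash_to_bucket helper, and the conservative vote-gap certificate are assumptions for illustration, and the tighter Finite Aggregation certificate from the paper is not implemented here.

```python
# Illustrative sketch only (not the authors' released code): DPA-style
# partition-and-aggregate, which Finite Aggregation generalizes by hashing
# into k*d buckets and training each base classifier on the union of d buckets.
import hashlib
from collections import Counter


def hash_to_bucket(sample_bytes: bytes, num_buckets: int) -> int:
    """Deterministically map a serialized training sample to one of num_buckets."""
    return int(hashlib.sha256(sample_bytes).hexdigest(), 16) % num_buckets


def partition(dataset, k: int):
    """Split the training set into k disjoint partitions (plain DPA, i.e. d = 1)."""
    buckets = [[] for _ in range(k)]
    for x_bytes, y in dataset:  # dataset: iterable of (serialized sample, label)
        buckets[hash_to_bucket(x_bytes, k)].append((x_bytes, y))
    return buckets


def aggregate_with_certificate(base_predictions):
    """Majority vote over base classifiers plus a conservative certificate.

    In DPA, poisoning one training sample can change at most one base
    prediction, so the vote is stable against floor((gap - 1) / 2) poisoned
    samples; the paper's tie-breaking and Finite Aggregation analyses give
    tighter bounds than this sketch.
    """
    ranked = Counter(base_predictions).most_common()
    top_class, top_votes = ranked[0]
    runner_up_votes = ranked[1][1] if len(ranked) > 1 else 0
    certified_radius = max((top_votes - runner_up_votes - 1) // 2, 0)
    return top_class, certified_radius


# Example: 7 of 10 base classifiers vote class 3, the rest vote class 5.
pred, radius = aggregate_with_certificate([3] * 7 + [5] * 3)
print(pred, radius)  # 3, 1 -> certified against poisoning one training sample
```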
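
The Experiment Setup row notes that horizontal flips are excluded from data augmentation on MNIST and GTSRB. Below is a minimal sketch of that choice, assuming a standard torchvision pipeline; make_train_transform is a hypothetical helper, and the crop/padding values are illustrative rather than the exact hyperparameters from Gidaris et al. (2018).

```python
# Minimal sketch, assuming a standard torchvision augmentation pipeline.
from torchvision import transforms


def make_train_transform(dataset_name: str, crop_size: int) -> transforms.Compose:
    ops = [transforms.RandomCrop(crop_size, padding=4)]
    # Horizontal flips are excluded on MNIST and GTSRB (digits and traffic
    # signs are not mirror-symmetric); CIFAR-10 presumably keeps them.
    if dataset_name.lower() == "cifar10":
        ops.append(transforms.RandomHorizontalFlip())
    ops.append(transforms.ToTensor())
    return transforms.Compose(ops)


# Example usage for the three datasets evaluated in the paper.
mnist_tf = make_train_transform("mnist", crop_size=28)
cifar_tf = make_train_transform("cifar10", crop_size=32)
gtsrb_tf = make_train_transform("gtsrb", crop_size=32)
```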