Robust Learning for Data Poisoning Attacks

Authors: Yunjuan Wang, Poorya Mianjy, Raman Arora

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our theoretical results with empirical evaluations on real datasets in Section 5. ... The goal of this section is to provide experimental support for our theoretical findings in Section 2 and Section 3. Code is available on Github. First, we describe the experimental setup. Datasets. We utilize the MNIST and the CIFAR10 datasets for the empirical evaluation.
Researcher Affiliation | Academia | Yunjuan Wang, Poorya Mianjy, Raman Arora; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The goal of this section is to provide experimental support for our theoretical findings in Section 2 and Section 3. Code is available on Github: https://github.com/bettyttytty/robust_learning_for_data_poisoning_attack
Open Datasets | Yes | Datasets. We utilize the MNIST and the CIFAR10 datasets for the empirical evaluation. MNIST is a dataset of 28×28 greyscale handwritten digits, containing 70K samples in 10 classes, with 60K training images and 10K test images. CIFAR10 is a dataset of 32×32 color images, containing 60K samples in 10 classes, with 50K training images and 10K test images.
Dataset Splits | Yes | MNIST is a dataset of 28×28 greyscale handwritten digits, containing 70K samples in 10 classes, with 60K training images and 10K test images. CIFAR10 is a dataset of 32×32 color images, containing 60K samples in 10 classes, with 50K training images and 10K test images. ... We perform 5-fold cross validation to pick model parameters including learning rate.
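The 5-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration, not the paper's code: the candidate learning rates and the `train_and_score` callback are stand-ins for whatever training routine is being tuned.

```python
import numpy as np

def five_fold_cv(X, y, candidate_lrs, train_and_score, seed=0):
    """Pick the learning rate with the best mean validation score
    across 5 folds. `train_and_score` is any callable
    (X_tr, y_tr, X_val, y_val, lr) -> validation score."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), 5)
    best_lr, best_score = None, -np.inf
    for lr in candidate_lrs:
        scores = []
        for k in range(5):
            val_idx = folds[k]
            tr_idx = np.concatenate([folds[j] for j in range(5) if j != k])
            scores.append(train_and_score(X[tr_idx], y[tr_idx],
                                          X[val_idx], y[val_idx], lr))
        mean_score = np.mean(scores)
        if mean_score > best_score:
            best_lr, best_score = lr, mean_score
    return best_lr, best_score
```

The same fold structure works for any hyperparameter; only the candidate list and the scoring callback change.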
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions 'Pytorch initialization' and 'vanilla SGD' but does not provide specific version numbers for these or any other software components or libraries.
Experiment Setup | Yes | We initialize the networks using Pytorch initialization and train them using cross-entropy loss. We track the test accuracy of the networks as a function of the width to verify our theorems. ... We use vanilla SGD with batch size 128 (no momentum, no weight decay, no data augmentation). Each curve is averaged over 50 runs and shaded regions show standard deviation. We perform 5-fold cross validation to pick model parameters including learning rate.
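The training recipe quoted above (cross-entropy loss, vanilla minibatch SGD with batch size 128, no momentum, no weight decay, no data augmentation) can be sketched in plain numpy. The softmax-regression model and synthetic data below are stand-ins for illustration only; the paper trains neural networks of varying width with PyTorch.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_sgd(X, y, num_classes, lr=0.1, batch_size=128, epochs=5, seed=0):
    """Vanilla minibatch SGD on softmax cross-entropy:
    no momentum, no weight decay, no data augmentation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = 0.01 * rng.standard_normal((d, num_classes))  # small random init
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            probs = softmax(Xb @ W)
            probs[np.arange(len(idx)), yb] -= 1.0  # dL/dlogits for CE loss
            grad = Xb.T @ probs / len(idx)
            W -= lr * grad  # plain SGD step, no momentum or decay
    return W

def accuracy(W, X, y):
    """Fraction of samples whose argmax prediction matches the label."""
    return float((softmax(X @ W).argmax(axis=1) == y).mean())
```

Tracking `accuracy` on a held-out test set after each epoch mirrors how the paper reports test accuracy as a function of network width.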