Robust Learning for Data Poisoning Attacks

Authors: Yunjuan Wang, Poorya Mianjy, Raman Arora

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our theoretical results with empirical evaluations on real datasets in Section 5. ... The goal of this section is to provide experimental support for our theoretical findings in Section 2 and Section 3. Code is available on Github. First, we describe the experimental setup. Datasets. We utilize the MNIST and the CIFAR10 datasets for the empirical evaluation.
Researcher Affiliation | Academia | Yunjuan Wang, Poorya Mianjy, Raman Arora; Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The goal of this section is to provide experimental support for our theoretical findings in Section 2 and Section 3. Code is available on Github: https://github.com/bettyttytty/robust_learning_for_data_poisoning_attack
Open Datasets | Yes | Datasets. We utilize the MNIST and the CIFAR10 datasets for the empirical evaluation. MNIST is a dataset of 28×28 greyscale handwritten digits, containing 70K samples in 10 classes, with 60K training images and 10K test images. CIFAR10 is a dataset of 32×32 color images, containing 60K samples in 10 classes, with 50K training images and 10K test images.
Dataset Splits | Yes | MNIST is a dataset of 28×28 greyscale handwritten digits, containing 70K samples in 10 classes, with 60K training images and 10K test images. CIFAR10 is a dataset of 32×32 color images, containing 60K samples in 10 classes, with 50K training images and 10K test images. ... We perform 5-fold cross validation to pick model parameters including learning rate.
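The 5-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration, not the paper's code: the candidate learning rates and the `train_and_score` callback are stand-ins for whatever training routine is being tuned.

```python
import numpy as np

def five_fold_cv(X, y, candidate_lrs, train_and_score, seed=0):
    """Pick the learning rate with the best mean validation score
    across 5 folds. `train_and_score` is any callable
    (X_tr, y_tr, X_val, y_val, lr) -> validation score."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), 5)
    best_lr, best_score = None, -np.inf
    for lr in candidate_lrs:
        scores = []
        for k in range(5):
            val_idx = folds[k]
            tr_idx = np.concatenate([folds[j] for j in range(5) if j != k])
            scores.append(train_and_score(X[tr_idx], y[tr_idx],
                                          X[val_idx], y[val_idx], lr))
        mean_score = np.mean(scores)
        if mean_score > best_score:
            best_lr, best_score = lr, mean_score
    return best_lr, best_score
```

The same fold structure works for any hyperparameter; only the candidate list and the scoring callback change.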
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions 'Pytorch initialization' and 'vanilla SGD' but does not provide specific version numbers for these or any other software components or libraries.
Experiment Setup | Yes | We initialize the networks using Pytorch initialization and train them using cross-entropy loss. We track the test accuracy of the networks as a function of the width to verify our theorems. ... We use vanilla SGD with batch size 128 (no momentum, no weight decay, no data augmentation). Each curve is averaged over 50 runs and shaded regions show standard deviation. We perform 5-fold cross validation to pick model parameters including learning rate.
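The training recipe quoted above (cross-entropy loss, vanilla minibatch SGD with batch size 128, no momentum, no weight decay, no data augmentation) can be sketched in plain numpy. The softmax-regression model and synthetic data below are stand-ins for illustration only; the paper trains neural networks of varying width with PyTorch.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_sgd(X, y, num_classes, lr=0.1, batch_size=128, epochs=5, seed=0):
    """Vanilla minibatch SGD on softmax cross-entropy:
    no momentum, no weight decay, no data augmentation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = 0.01 * rng.standard_normal((d, num_classes))  # small random init
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            probs = softmax(Xb @ W)
            probs[np.arange(len(idx)), yb] -= 1.0  # dL/dlogits for CE loss
            grad = Xb.T @ probs / len(idx)
            W -= lr * grad  # plain SGD step, no momentum or decay
    return W

def accuracy(W, X, y):
    """Fraction of samples whose argmax prediction matches the label."""
    return float((softmax(X @ W).argmax(axis=1) == y).mean())
```

Tracking `accuracy` on a held-out test set after each epoch mirrors how the paper reports test accuracy as a function of network width.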