Not All Poisons are Created Equal: Robust Training against Data Poisoning

Authors: Yu Yang, Tian Yu Liu, Baharan Mirzasoleiman

ICML 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Our extensive experiments show that our method significantly decreases the success rate of state-of-the-art targeted attacks, including Gradient Matching and Bullseye Polytope, and easily scales to large datasets." |
| Researcher Affiliation | Academia | "Department of Computer Science, University of California, Los Angeles, United States. Correspondence to: Yu Yang <yuyang@cs.ucla.edu>." |
| Pseudocode | Yes | "Algorithm 1 Effective Poison Identification (EPIC)" (a hedged sketch of the loop follows this table) |
| Open Source Code | Yes | "Code is available at https://github.com/YuYang0901/Epic." |
| Open Datasets | Yes | "For our evaluation, we use the standardized data poisoning benchmark (Schwarzschild et al., 2020)..." |
| Dataset Splits | No | The paper refers to the "standardized data poisoning benchmark (Schwarzschild et al., 2020)" and reports "validation accuracy", but does not give explicit percentages or sample counts for the training/validation/test splits. |
| Hardware Specification | Yes | "We report the wall-clock time of training a model with each defense on a single NVIDIA A40 GPU with 4 workers." |
| Software Dependencies | No | The paper does not pin its software dependencies to version numbers (e.g., Python, PyTorch, or other library versions). |
| Experiment Setup | Yes | "For our evaluation, we use the standardized data poisoning benchmark (Schwarzschild et al., 2020), with 200 training epochs, starting learning rate of 0.1 and decaying factor of 10 at epochs 100, 150. ... train ResNet-18 from scratch with 128 examples per mini-batch. ... In the experiments, we set K = 10 for EPIC-0.1, K = 20 for EPIC-0.2 and K = 30 for EPIC-0.3. ... Here, we apply EPIC with T = 1." (a sketch of this configuration follows the EPIC sketch below) |
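
The table quotes Algorithm 1 (EPIC) by name only. Its core loop, as described in the paper, is: periodically map each training example into gradient space, select medoids, and drop examples that are isolated there. Below is a minimal sketch of that idea, not the authors' implementation: the softmax-minus-one-hot gradient proxy, the farthest-point medoid selection, the global (rather than per-class) clustering, the warmup length, and the helpers `make_loader` / `train_one_epoch` are all assumptions; the paper's K and T parameters map onto `k` and `T` below.

```python
import torch
import torch.nn.functional as F


def last_layer_grad_proxy(model, loader, device):
    """Per-example proxy for the loss gradient w.r.t. the last layer:
    softmax(logits) - onehot(label), a common cheap stand-in for the
    full per-example gradient."""
    model.eval()
    grads, indices = [], []
    with torch.no_grad():
        for x, y, idx in loader:  # loader assumed to yield sample indices
            p = F.softmax(model(x.to(device)), dim=1).cpu()
            grads.append(p - F.one_hot(y, p.size(1)).float())
            indices.append(idx)
    return torch.cat(grads), torch.cat(indices)


def isolated_points(grads, k):
    """Pick k medoids (farthest-point heuristic) and return positions of
    medoids whose cluster contains no other example ("isolated" points)."""
    d = torch.cdist(grads, grads)       # pairwise distances in gradient space
    medoids = [int(d.sum(1).argmin())]  # seed with the most central example
    while len(medoids) < k:
        medoids.append(int(d[:, medoids].min(1).values.argmax()))
    assign = d[:, medoids].argmin(1)    # nearest-medoid assignment
    counts = torch.bincount(assign, minlength=k)
    return [medoids[j] for j in range(k) if counts[j] == 1]


def train_with_epic(model, dataset, device, epochs=200, warmup=5, T=1, k=30):
    """EPIC-style loop: every T epochs (after a warmup), recompute gradient
    proxies on the examples still kept and drop the isolated ones."""
    kept = torch.arange(len(dataset))   # indices still in the training set
    for epoch in range(epochs):
        if epoch >= warmup and epoch % T == 0:
            loader = make_loader(dataset, kept)  # hypothetical helper
            g, idx = last_layer_grad_proxy(model, loader, device)
            drop = torch.tensor([int(idx[i]) for i in isolated_points(g, k)])
            if len(drop) > 0:
                kept = kept[~torch.isin(kept, drop)]  # drop suspected poisons
        train_one_epoch(model, make_loader(dataset, kept), device)  # hypothetical
    return model
```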
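The quoted experiment setup maps directly onto a standard PyTorch schedule. A minimal sketch, assuming SGD with conventional momentum and weight decay and the CIFAR-10 benchmark setting (the quote specifies only the model, epoch count, batch size, and learning-rate schedule):

```python
import torch
import torchvision

model = torchvision.models.resnet18(num_classes=10)  # ResNet-18 from scratch; CIFAR-10 assumed
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,       # stated: starting lr 0.1
                            momentum=0.9, weight_decay=5e-4)  # assumed values
# "decaying factor of 10 at epochs 100, 150" -> multiply lr by 0.1 at those epochs
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150], gamma=0.1)

for epoch in range(200):  # stated: 200 training epochs
    # ... one pass over the poisoned training set, 128 examples per mini-batch ...
    scheduler.step()
```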