Detection and Defense of Unlearnable Examples

Authors: Yifan Zhu, Lijia Yu, Xiao-Shan Gao

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We provide theoretical results on the linear separability of certain unlearnable poisoned datasets and simple network-based detection methods that can identify all existing unlearnable examples, as demonstrated by extensive experiments. Our experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet demonstrate that all the major unlearnable examples can be effectively detected by both algorithms. |
| Researcher Affiliation | Academia | Yifan Zhu (1,3), Lijia Yu (2,3), Xiao-Shan Gao (1,3); 1: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; 2: Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 3: University of Chinese Academy of Sciences, Beijing 101408, China; zhuyifan@amss.ac.cn, yulijia@ios.ac.cn, xgao@mmrc.iss.ac.cn |
| Pseudocode | Yes | Algorithm 1: Simple Networks Detection; Algorithm 2: Bias-shifting Noise Test; Algorithm 3: Stronger Data Augmentations with Adversarial Noises (SDA+AN). Hedged sketches of these algorithms follow the table. |
| Open Source Code | Yes | Our codes are available at https://github.com/hala64/udp. |
| Open Datasets | Yes | Our experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet demonstrate that all the major unlearnable examples can be effectively detected by both algorithms. |
| Dataset Splits | Yes | Randomly split D into training and validation sets D_tr and D_va. |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or cloud instance specifications) are mentioned for running the experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers are mentioned in the paper. |
| Experiment Setup | Yes | The detection bound is B = 0.7. For Algorithm 2, the bias-shifting noise is set to ϵ_b = 0.5e for simplicity. Adversarial noises ϵ(x_i) are generated by a PGD attack (Madry et al. 2018) on the robustly learned network F_simple^robust, which is obtained by adversarial training; see the PGD sketch after the table. |
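The sketches below illustrate the algorithms referenced in the table. First, Algorithm 1 (Simple Networks Detection) builds on the paper's observation that unlearnable perturbations make the poisoned dataset nearly linearly separable, so a very simple network trained on it reaches suspiciously high validation accuracy. The following is a minimal PyTorch sketch, not the authors' exact implementation: the linear-probe architecture, 80/20 split ratio, optimizer, and epoch count are illustrative assumptions, while the detection bound B = 0.7 comes from the Experiment Setup row above.

```python
# Hypothetical sketch of Algorithm 1 (Simple Networks Detection).
# Idea: unlearnable perturbations are (nearly) linearly separable, so a very
# simple model trained on the suspect dataset reaches suspiciously high
# validation accuracy. Architecture and training details are assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, random_split

def detect_unlearnable(dataset, num_classes=10, bound=0.7,
                       epochs=20, device="cpu"):
    # Randomly split D into training and validation sets D_tr and D_va
    # (80/20 here; the exact ratio is an assumption).
    n_tr = int(0.8 * len(dataset))
    d_tr, d_va = random_split(dataset, [n_tr, len(dataset) - n_tr])
    tr = DataLoader(d_tr, batch_size=128, shuffle=True)
    va = DataLoader(d_va, batch_size=256)

    # "Simple network": a single linear layer on flattened CIFAR-sized pixels.
    model = nn.Sequential(nn.Flatten(),
                          nn.Linear(3 * 32 * 32, num_classes)).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x, y in tr:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

    correct = total = 0
    with torch.no_grad():
        for x, y in va:
            x, y = x.to(device), y.to(device)
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
    acc = correct / total
    # A simple model should NOT fit clean data this well; validation accuracy
    # above the bound B = 0.7 flags the dataset as unlearnable/poisoned.
    return acc > bound, acc
```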
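Algorithm 2 (Bias-shifting Noise Test) uses the bias-shifting noise ϵ_b = 0.5e. A small helper is sketched below, assuming e denotes the all-ones image and pixel values live in [0, 1]; how Algorithm 2 consumes the shifted data is specified in the paper, so only the noise itself is shown, and the clamping to the valid pixel range is an assumption.

```python
# Hypothetical helper for the bias-shifting noise eps_b = 0.5 * e used by
# Algorithm 2, assuming e is the all-ones image. The surrounding test logic
# is defined in the paper; this shows only the shift.
import torch

def bias_shift(x: torch.Tensor, scale: float = 0.5) -> torch.Tensor:
    # x: batch of images with pixel values in [0, 1].
    e = torch.ones_like(x)                      # the all-ones "image" e
    return torch.clamp(x + scale * e, 0.0, 1.0) # clamping is an assumption
```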
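Finally, Algorithm 3 (SDA+AN) requires adversarial noises ϵ(x_i) generated by a PGD attack on the adversarially trained network F_simple^robust. Below is a minimal L∞ PGD sketch in the spirit of Madry et al. (2018); the radius ϵ, step size α, and number of steps are assumptions, since the report does not list the paper's exact attack hyperparameters.

```python
# Minimal L_inf PGD sketch (Madry et al. 2018) for generating the adversarial
# noises eps(x_i) used in Algorithm 3 (SDA+AN). The attack targets the
# adversarially trained simple network F_simple^robust; eps, alpha, and the
# step count below are illustrative assumptions.
import torch
import torch.nn as nn

def pgd_noise(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    model.eval()
    # Random start inside the L_inf ball of radius eps.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        loss = loss_fn(model(torch.clamp(x + delta, 0, 1)), y)
        loss.backward()
        with torch.no_grad():
            # Gradient ascent on the loss, projected back to the L_inf ball.
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return delta.detach()  # eps(x_i): the adversarial noise added to x_i
```

In SDA+AN, the returned noise would be added to the training inputs alongside the stronger data augmentations the algorithm's name refers to.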