Unlearnable Examples: Making Personal Data Unexploitable
Authors: Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, James Bailey, Yisen Wang
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify the effectiveness of error-minimizing noise in both sample-wise and class-wise forms. We also demonstrate its flexibility under extensive experimental settings and practicability in a case study of face recognition. |
| Researcher Affiliation | Academia | (1) The University of Melbourne, VIC, Australia; (2) Deakin University, Geelong, VIC, Australia; (3) Key Lab. of Machine Perception (MoE), School of EECS, Peking University, Beijing, China |
| Pseudocode | Yes | The detailed training pipeline is described in Algorithm 1 in the Appendix. |
| Open Source Code | Yes | Code is available at https://github.com/HanxunH/Unlearnable-Examples. |
| Open Datasets | Yes | We apply error-minimizing noise to the training set of 4 commonly used image datasets: SVHN (Netzer et al., 2011), CIFAR-10, CIFAR-100 (Krizhevsky, 2009), and ImageNet subset (the first 100 classes) (Russakovsky et al., 2015). |
| Dataset Splits | No | The paper defines clean training (Dc) and test (Dt) datasets but does not explicitly specify how the training data itself is split for validation during model training, nor does it provide percentages or sample counts for a validation set. |
| Hardware Specification | No | The paper mentions 'the LIEF HPC-GPGPU Facility hosted at the University of Melbourne' in the acknowledgments, but does not specify exact GPU models, CPU types, or other detailed hardware specifications used for the experiments. |
| Software Dependencies | No | The paper mentions optimizers like Stochastic Gradient Descent (SGD) and Adam, but does not specify the versions of the deep learning frameworks (e.g., PyTorch, TensorFlow) or other libraries used with version numbers. |
| Experiment Setup | Yes | For all experiments, we use L∞-norm bounded noise (‖δ‖∞ ≤ ϵ) to regularize the imperceptibility: ϵ = 8/255 for CIFAR and SVHN, ϵ = 16/255 for ImageNet; different ϵ values are explored in Appendix D for additional understanding. The iterative steps T for Equation 3 are set to 20 for sample-wise noise and 1 for class-wise noise, and α is set to ϵ/10. Since the class-wise noise is generated universally for each class, small iterative steps avoid overfitting to specific examples. For the class-wise experiments, we only use 20% of the training data Dc to generate the noise δ and apply it to the entire training dataset Dc ∪ Du. The stop-condition error rate is λ = 0.1 for class-wise noise (δc) and λ = 0.01 for sample-wise noise (δs). We set M = 10 for SVHN and CIFAR-10, M = 20 for CIFAR-100 and M = 100 for the ImageNet-mini setting (the 100-class subset). For all models and experiments, we use the Stochastic Gradient Descent (SGD) (LeCun et al., 1998) optimizer with momentum 0.9, initial learning rate 0.025 and a cosine scheduler (Loshchilov & Hutter, 2017) without restarts. We train all DNN models for 30 epochs on SVHN, 60 epochs on CIFAR-10, and 100 epochs on CIFAR-100 and ImageNet. |
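
The noise-generation hyperparameters in the Experiment Setup row (Equation 3, T iterative steps, step size α = ϵ/10, L∞ bound ϵ) describe an iterative error-minimizing update. Below is a minimal sketch of one sample-wise update, assuming a PyTorch implementation; the function and variable names are illustrative and not taken from the authors' code.

```python
import torch
import torch.nn.functional as F


def error_minimizing_noise(model, images, labels, delta,
                           epsilon=8 / 255, steps=20):
    """Illustrative sample-wise error-minimizing update (hypothetical names).

    Unlike adversarial (error-maximizing) noise, this performs gradient
    *descent* on the training loss so the perturbed examples carry almost
    no useful learning signal. epsilon = 8/255, steps = 20 and
    alpha = epsilon / 10 follow the CIFAR/SVHN settings quoted above.
    """
    alpha = epsilon / 10
    model.eval()
    delta = delta.clone().detach()
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(images + delta), labels)
        grad = torch.autograd.grad(loss, delta)[0]
        # Descend the loss, then project back into the L-infinity ball
        # and keep the perturbed image inside the valid pixel range [0, 1].
        delta = delta.detach() - alpha * grad.sign()
        delta = torch.clamp(delta, -epsilon, epsilon)
        delta = torch.clamp(images + delta, 0.0, 1.0) - images
    return delta.detach()
```

The sign of the update is the key design choice: adversarial training would add the signed gradient, whereas error-minimizing noise subtracts it, driving the training error toward the stop-condition threshold λ.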
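
The training recipe quoted in the same row (SGD with momentum 0.9, initial learning rate 0.025, cosine schedule without restarts) maps onto standard components. A sketch assuming PyTorch, with the epoch count passed per dataset:

```python
import torch


def make_optimizer_and_scheduler(model, epochs=60, lr=0.025, momentum=0.9):
    """SGD with momentum plus a cosine-annealed learning rate without
    restarts, matching the quoted settings (e.g. epochs=60 for CIFAR-10,
    30 for SVHN, 100 for CIFAR-100/ImageNet). Weight decay is not stated
    in the quoted text, so PyTorch's default of 0 is kept here.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler
```

Since the paper does not name the framework or library versions (see the Software Dependencies row), this is only one plausible realization of the stated settings.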