Unlearnable Examples: Making Personal Data Unexploitable
Authors: Hanxun Huang, Xingjun Ma, Sarah Monazam Erfani, James Bailey, Yisen Wang
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify the effectiveness of error-minimizing noise in both sample-wise and class-wise forms. We also demonstrate its flexibility under extensive experimental settings and practicability in a case study of face recognition. |
| Researcher Affiliation | Academia | (1) The University of Melbourne, VIC, Australia; (2) Deakin University, Geelong, VIC, Australia; (3) Key Lab. of Machine Perception (MoE), School of EECS, Peking University, Beijing, China |
| Pseudocode | Yes | The detailed training pipeline is described in Algorithm 1 in the Appendix. |
| Open Source Code | Yes | Code is available at https://github.com/HanxunH/Unlearnable-Examples. |
| Open Datasets | Yes | We apply error-minimizing noise to the training set of 4 commonly used image datasets: SVHN (Netzer et al., 2011), CIFAR-10, CIFAR-100 (Krizhevsky, 2009), and ImageNet subset (the first 100 classes) (Russakovsky et al., 2015). |
| Dataset Splits | No | The paper defines clean training (Dc) and test (Dt) datasets but does not explicitly specify how the training data itself is split for validation during model training, nor does it provide percentages or sample counts for a validation set. |
| Hardware Specification | No | The paper mentions 'the LIEF HPC-GPGPU Facility hosted at the University of Melbourne' in the acknowledgments, but does not specify exact GPU models, CPU types, or other detailed hardware specifications used for the experiments. |
| Software Dependencies | No | The paper mentions optimizers like Stochastic Gradient Descent (SGD) and Adam, but does not specify the versions of the deep learning frameworks (e.g., PyTorch, TensorFlow) or other libraries used with version numbers. |
| Experiment Setup | Yes | For all experiments, we use L∞-norm bounded noise (‖δ‖∞ ≤ ϵ) to regularize the imperceptibility: ϵ = 8/255 for CIFAR and SVHN, ϵ = 16/255 for ImageNet; different ϵ values are explored in Appendix D for additional understanding. The iterative steps T for Equation 3 are set to 20 for sample-wise noise and 1 for class-wise noise, and α is set to ϵ/10. Since the class-wise noise is generated universally for each class, small iterative steps avoid overfitting to specific examples. For the class-wise experiments, we only use 20% of the training data Dc to generate the noise δ and apply it to the entire training dataset Dc ∪ Du. The stop-condition error rate is λ = 0.1 for class-wise noise (δc) and λ = 0.01 for sample-wise noise (δs). We set M = 10 for SVHN and CIFAR-10, M = 20 for CIFAR-100 and M = 100 for the ImageNet-mini setting (the 100-class subset). For all models and experiments, we use the Stochastic Gradient Descent (SGD) (LeCun et al., 1998) optimizer with momentum 0.9, initial learning rate 0.025 and a cosine scheduler (Loshchilov & Hutter, 2017) without restarts. We train all DNN models for 30 epochs on SVHN, 60 epochs on CIFAR-10, and 100 epochs on CIFAR-100 and ImageNet. |
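
The noise-generation hyperparameters in the Experiment Setup row (Equation 3, T iterative steps, step size α = ϵ/10, L∞ bound ϵ) describe an iterative error-minimizing update. Below is a minimal sketch of one sample-wise update, assuming a PyTorch implementation; the function and variable names are illustrative and not taken from the authors' code.

```python
import torch
import torch.nn.functional as F


def error_minimizing_noise(model, images, labels, delta,
                           epsilon=8 / 255, steps=20):
    """Illustrative sample-wise error-minimizing update (hypothetical names).

    Unlike adversarial (error-maximizing) noise, this performs gradient
    *descent* on the training loss so the perturbed examples carry almost
    no useful learning signal. epsilon = 8/255, steps = 20 and
    alpha = epsilon / 10 follow the CIFAR/SVHN settings quoted above.
    """
    alpha = epsilon / 10
    model.eval()
    delta = delta.clone().detach()
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(images + delta), labels)
        grad = torch.autograd.grad(loss, delta)[0]
        # Descend the loss, then project back into the L-infinity ball
        # and keep the perturbed image inside the valid pixel range [0, 1].
        delta = delta.detach() - alpha * grad.sign()
        delta = torch.clamp(delta, -epsilon, epsilon)
        delta = torch.clamp(images + delta, 0.0, 1.0) - images
    return delta.detach()
```

The sign of the update is the key design choice: adversarial training would add the signed gradient, whereas error-minimizing noise subtracts it, driving the training error toward the stop-condition threshold λ.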
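
The training recipe quoted in the same row (SGD with momentum 0.9, initial learning rate 0.025, cosine schedule without restarts) maps onto standard components. A sketch assuming PyTorch, with the epoch count passed per dataset:

```python
import torch


def make_optimizer_and_scheduler(model, epochs=60, lr=0.025, momentum=0.9):
    """SGD with momentum plus a cosine-annealed learning rate without
    restarts, matching the quoted settings (e.g. epochs=60 for CIFAR-10,
    30 for SVHN, 100 for CIFAR-100/ImageNet). Weight decay is not stated
    in the quoted text, so PyTorch's default of 0 is kept here.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=momentum)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler
```

Since the paper does not name the framework or library versions (see the Software Dependencies row), this is only one plausible realization of the stated settings.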