Learning to Unlearn: Instance-Wise Unlearning for Pre-trained Classifiers

Authors: Sungmin Cha, Sungjun Cho, Dasol Hwang, Honglak Lee, Taesup Moon, Moontae Lee

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experimentation on various image classification benchmarks, we show that our approach effectively preserves knowledge of remaining data while unlearning given instances in both single-task and continual unlearning scenarios. [...] Datasets and baselines. We evaluate our unlearning methods on three different image classification datasets: CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009), and ImageNet-1K (Deng et al. 2009). We use ResNet-18 as the base model for CIFAR-10, and ResNet-50 for CIFAR-100 and ImageNet-1K (He et al. 2016). Experimental results from other base models such as MobileNetV2 (Sandler et al. 2018), SqueezeNet (Iandola et al. 2016), DenseNet (Huang et al. 2017), and ViT (Dosovitskiy et al. 2020) are also available in the supplementary material. (A minimal dataset/model sketch appears after the table.)
Researcher Affiliation | Collaboration | Sungmin Cha1*, Sungjun Cho2*, Dasol Hwang2*, Honglak Lee2, Taesup Moon3, and Moontae Lee2,4 (1 New York University; 2 LG AI Research; 3 ASRI / INMC / Seoul National University; 4 University of Illinois Chicago)
Pseudocode | No | The pseudocode of measuring weight importance is shown in Algorithm 2 of the supplementary material. [...] The pseudocode of our overall unlearning pipeline is presented in the supplementary material.
Open Source Code | No | No explicit statement or link providing access to source code for the methodology was found in the provided text.
Open Datasets | Yes | We evaluate our unlearning methods on three different image classification datasets: CIFAR-10, CIFAR-100 (Krizhevsky, Hinton et al. 2009), and ImageNet-1K (Deng et al. 2009). [...] Specifically, we first pretrain ResNet-18 on age-group classification on the UTKFace dataset, consisting of 20k facial images, each labeled with the age of the subject ranging between 0 and 116 (Zhang, Song, and Qi 2017).
Dataset Splits | Yes | Let Dtrain be the entire training dataset used to pre-train a classification model gθ : X → Y. We denote Df ⊆ Dtrain as the unlearning dataset that we want to intentionally forget from the pretrained model and Dr as the remaining dataset on which we wish to maintain predictive accuracy (Dr := Dtrain \ Df). We denote a pair of an input image x ∈ X and its ground-truth label y ∈ Y from Dtrain as (x, y) ∈ Dtrain, and similarly (xf, yf) ∈ Df and (xr, yr) ∈ Dr. Also, Dtest denotes the test dataset used for evaluation. For each dataset, we randomly pick k ∈ {16, 64, 128, 256} images from the entire training dataset as unlearning data Df and consider the remaining data as Dr. (A split sketch appears after the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments were mentioned in the provided text.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with versions) were mentioned in the provided text.
Experiment Setup | Yes | For unlearning, we use an SGD optimizer with a learning rate of 1e-3, weight decay of 1e-5, and momentum of 0.9 across all experiments. We early-stop training when the model attains 0% or 100% accuracy on the unlearning data Df, in the case of misclassifying and relabeling, respectively. For generating adversarial examples from Df, we use a targeted L2-PGD attack (Madry et al. 2017) with a learning rate of 1e-1, 100 attack iterations, and ϵ = 0.4. We generate 20 adversarial examples per image for CIFAR-10 and 200 examples per image for CIFAR-100 and ImageNet-1K. For the weight importance regularization, we set the regularization strength λ = 1 in Eq. 5. (An unlearning-setup sketch appears after the table.)
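
To make the dataset/model pairings quoted in the Research Type row concrete, here is a minimal sketch assuming standard torchvision APIs; the paths and transforms are our choices, since the authors' training code is not released.

```python
# Minimal sketch of the reported dataset/model pairings, assuming standard
# torchvision APIs; paths and transforms are our assumptions.
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

to_tensor = transforms.ToTensor()

# CIFAR-10 is paired with ResNet-18 in the paper.
cifar10 = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
resnet18 = models.resnet18(num_classes=10)

# CIFAR-100 and ImageNet-1K are paired with ResNet-50.
cifar100 = datasets.CIFAR100(root="data", train=True, download=True, transform=to_tensor)
resnet50 = models.resnet50(num_classes=100)
```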
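The Dataset Splits row fully determines how Df and Dr are built: pick k training instances uniformly at random as Df and keep the rest as Dr. Below is a minimal PyTorch sketch assuming any map-style dataset; the helper name make_unlearning_split and the seed are hypothetical, not from the paper.

```python
# Sketch of the D_f / D_r split described in the Dataset Splits row.
import torch
from torch.utils.data import Subset

def make_unlearning_split(train_set, k, seed=0):
    """Randomly pick k instances to forget (D_f); the rest remain (D_r)."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(len(train_set), generator=g)
    d_f = Subset(train_set, perm[:k].tolist())
    d_r = Subset(train_set, perm[k:].tolist())
    return d_f, d_r

# The paper sweeps k over {16, 64, 128, 256}; cifar10 is from the sketch above.
d_f, d_r = make_unlearning_split(cifar10, k=128)
```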
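The Experiment Setup row pins down the optimizer and attack hyperparameters. The sketch below wires them into a plausible PyTorch setup: the SGD settings and the early-stopping criterion are quoted from the paper, while the targeted L2-PGD routine is our hedged reconstruction of Madry et al.'s attack with the reported learning rate, iteration count, and ϵ; it is not the authors' code.

```python
# Hedged reconstruction of the reported unlearning setup; loop structure and
# the L2-PGD implementation are our assumptions, hyperparameters are quoted.
import torch
import torch.nn.functional as F

model = resnet18  # pretrained classifier g_theta (first sketch)

# Reported optimizer: SGD with lr 1e-3, weight decay 1e-5, momentum 0.9.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            weight_decay=1e-5, momentum=0.9)

def forget_accuracy(model, loader, device="cpu"):
    """Accuracy on D_f; training early-stops at 0% (misclassifying)
    or 100% (relabeling), per the paper."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (model(x.to(device)).argmax(1) == y.to(device)).sum().item()
            total += y.numel()
    return correct / max(total, 1)

def l2_pgd_targeted(model, x, target, eps=0.4, lr=0.1, steps=100):
    """Targeted L2-PGD (Madry et al. 2017) with the reported hyperparameters:
    descend toward `target` while projecting onto an L2 ball of radius eps.
    (Pixel-range clipping is omitted for brevity.)"""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), target)
        grad, = torch.autograd.grad(loss, delta)
        g_norm = grad.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = delta - lr * grad / g_norm          # step toward the target class
        d_norm = delta.flatten(1).norm(dim=1).clamp_min(1e-12).view(-1, 1, 1, 1)
        delta = (delta * (eps / d_norm).clamp(max=1.0)).detach().requires_grad_(True)
    return (x + delta).detach()
```

Per the quoted setup, one would call l2_pgd_targeted 20 times per CIFAR-10 image (200 times for CIFAR-100 and ImageNet-1K), presumably with varying target classes, to produce the stated number of adversarial examples per instance.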