Model Sparsity Can Simplify Machine Unlearning
Authors: Jinghan Jia, Jiancheng Liu, Parikshit Ram, Yuguang Yao, Gaowen Liu, Yang Liu, Pranay Sharma, Sijia Liu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our proposals consistently benefit MU in various unlearning scenarios. A notable highlight is the 77% unlearning efficacy gain of fine-tuning (one of the simplest unlearning methods) when using sparsity-aware unlearning. Furthermore, we demonstrate the practical impact of our proposed MU methods in addressing other machine learning challenges, such as defending against backdoor attacks and enhancing transfer learning. |
| Researcher Affiliation | Collaboration | Jinghan Jia¹, Jiancheng Liu¹, Parikshit Ram², Yuguang Yao¹, Gaowen Liu³, Yang Liu⁴,⁵, Pranay Sharma⁶, Sijia Liu¹,²; ¹Michigan State University, ²IBM Research, ³Cisco Research, ⁴University of California, Santa Cruz, ⁵ByteDance Research, ⁶Carnegie Mellon University. Equal contribution. |
| Pseudocode | No | The paper provides mathematical derivations and describes methods in text and equations, but it does not include any explicitly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | Codes are available at https://github.com/OPTML-Group/Unlearn-Sparse. |
| Open Datasets | Yes | Unless specified otherwise, our experiments will focus on image classification under CIFAR-10 [43] using ResNet-18 [44]. Yet, experiments on additional datasets (CIFAR-100 [43], SVHN [45], and ImageNet [46]) and an alternative model architecture (VGG-16 [47]) can be found in Appendix C.4. |
| Dataset Splits | No | The paper extensively discusses training datasets (D), forgetting datasets (Df), remaining datasets (Dr), and test datasets (the forget/retain split convention is illustrated in the first sketch after this table). However, it does not provide specific details on validation splits (e.g., percentages or counts for a validation set) within its experimental setup. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only mentions accelerating ImageNet training on GPUs using FFCV [54]. |
| Software Dependencies | No | The paper mentions software components such as PyTorch (implied by its references to models like ResNet and VGG and to the training setup) and SGD as the optimizer. However, it does not specify exact version numbers for these or for other ancillary software dependencies, which would be required for reproducibility. |
| Experiment Setup | Yes | For all datasets and model architectures, we adopt 10 epochs for FT and 5 epochs for the GA method. The learning rates for FT and GA are carefully tuned within [10⁻⁵, 0.1] for each dataset and model architecture. In particular, we adopt 0.01 as the learning rate for the FT method and 10⁻⁴ for GA on the CIFAR-10 dataset (ResNet-18, class-wise forgetting) at different sparsity levels. By default, we choose SGD as the optimizer for the FT and GA methods. As for the FF method, we perform a greedy search for hyperparameter tuning [12] between 10⁻⁹ and 10⁻⁶. For all pruning methods, including IMP [15], SynFlow [40], and OMP [17], we adopt the settings from the current SOTA implementations [17]; see a summary in Tab. A2. For IMP, OMP, and SynFlow, we adopt a step learning rate scheduler with a decay rate of 0.1 at 50% and 75% of the epochs. We adopt 0.1 as the initial learning rate for all pruning methods. (A hedged configuration sketch follows this table.) |
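
For context on the D/Df/Dr notation in the Dataset Splits row, the snippet below is a minimal, hypothetical sketch of how a class-wise forgetting split on CIFAR-10 could be built with torchvision. It is not the authors' released code; the name `forget_class` and the choice of class 0 are assumptions for illustration only.

```python
# Hypothetical sketch: class-wise forgetting split on CIFAR-10 (not the authors' code).
# D = full training set, Df = samples of the forgotten class, Dr = the remaining samples.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

transform = transforms.ToTensor()
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)

forget_class = 0  # assumed: the class to be unlearned
targets = torch.tensor(train_set.targets)
forget_idx = torch.nonzero(targets == forget_class).squeeze(1)
retain_idx = torch.nonzero(targets != forget_class).squeeze(1)

forget_set = Subset(train_set, forget_idx.tolist())   # Df
retain_set = Subset(train_set, retain_idx.tolist())   # Dr
```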
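The optimization settings quoted in the Experiment Setup row map onto standard PyTorch components. The sketch below mirrors those stated hyperparameters only; the model constructor, the momentum value, and the pruning epoch budget are assumptions not reported in the table.

```python
# Hypothetical sketch of the reported optimization settings (not the released code).
import torch
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # placeholder ResNet-18 head for CIFAR-10

# Fine-tuning (FT) unlearning: 10 epochs, SGD, lr = 0.01 (CIFAR-10, class-wise forgetting).
ft_epochs = 10
ft_optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # momentum assumed

# Gradient ascent (GA) unlearning: 5 epochs, SGD, lr = 1e-4.
ga_epochs = 5
ga_optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)  # momentum assumed

# Pruning runs (IMP / OMP / SynFlow): initial lr = 0.1 with a step scheduler
# that decays by a factor of 0.1 at 50% and 75% of the training epochs.
prune_epochs = 160  # assumed epoch budget; not quoted in the report
prune_optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    prune_optimizer,
    milestones=[int(0.5 * prune_epochs), int(0.75 * prune_epochs)],
    gamma=0.1,
)
```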