PUMA: Performance Unchanged Model Augmentation for Training Data Removal
Authors: Ga Wu, Masoud Hashemi, Christopher Srinivasa
AAAI 2022, pp. 8675-8682
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the effectiveness of the PUMA framework, we compared it with multiple state-of-the-art data removal techniques in the experiments, where we show that PUMA can effectively and efficiently remove the unique characteristics of marked training data without retraining the model, such that it can 1) fool a membership attack, and 2) resist performance degradation. |
| Researcher Affiliation | Industry | Ga Wu, Masoud Hashemi, Christopher Srinivasa Borealis AI {ga.wu, masoud.hashemi, christopher.srinivasa}@borealisai.com |
| Pseudocode | No | The paper describes the methodology using equations and text but does not include a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | Datasets: We conducted our experiments on two synthetic datasets, two tabular datasets from the UCI data group, and the MNIST dataset (LeCun et al. 1998). |
| Dataset Splits | No | The paper mentions using training data and refers to test samples in a theoretical context, but does not explicitly state the training, validation, or test dataset splits (e.g., percentages or counts) for the experiments. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., GPU/CPU models, memory, cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks) used for the experiments. |
| Experiment Setup | Yes | PUMA has one important hyper-parameter η, which controls the projection step of parameter augmentation. Indeed, a huge projection step η would seriously violate the Taylor approximation assumption that the PUMA approach relies on. Hence, in this experiment, we aim to demonstrate the importance of tuning this hyper-parameter. Figure 5 shows the trend of tuning η on two representative datasets (UCI German Credit and MNIST). Overall, there is a trade-off between the effectiveness of removing data and the ability to preserve model generalization. Keeping the projection rate in the range η ∈ [10⁻², 10⁻¹] often shows satisfactory removal performance while maintaining the model's generalization ability. |
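
Since the paper releases no code (see the Open Source Code row above), the following is only a minimal sketch of the role η plays as a projection step size in a first-order, influence-style parameter edit. The function name `puma_style_update`, its arguments, and the compensating gradient term are illustrative assumptions, not the authors' actual update rule, which relies on a Taylor approximation around the trained parameters.

```python
import numpy as np

def puma_style_update(theta, removal_grad, retain_grad, eta):
    """Hypothetical projection step of parameter augmentation.

    Pushes the parameters against the gradient contribution of the
    data marked for removal, while a gradient term from the retained
    data crudely offsets performance degradation. The step size eta
    controls how far the edit moves from the trained parameters.
    """
    return theta - eta * (removal_grad - retain_grad)

# Toy sweep over eta, mirroring the trade-off described in the row
# above: larger steps remove more influence but stray further from
# the regime where a first-order Taylor approximation holds.
rng = np.random.default_rng(0)
theta = rng.normal(size=10)          # stand-in trained parameters
removal_grad = rng.normal(size=10)   # stand-in gradient of marked data
retain_grad = rng.normal(size=10)    # stand-in gradient of retained data

for eta in [1e-3, 1e-2, 1e-1, 1.0]:
    theta_new = puma_style_update(theta, removal_grad, retain_grad, eta)
    drift = np.linalg.norm(theta_new - theta)
    print(f"eta={eta:g}: parameter drift {drift:.3f}")
```

The printed drift grows linearly with η here; in the paper's reported tuning, values in [10⁻², 10⁻¹] balance removal strength against generalization, which is consistent with keeping the edit small enough for the Taylor approximation to remain valid.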