Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Potion: Towards Poison Unlearning
Authors: Stefan Schoepf, Jack Foster, Alexandra Brintrup
DMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method heals 93.72% of poison compared to SSD with 83.41% and full retraining with 40.68%. We achieve this while also lowering the average model accuracy drop caused by unlearning from 5.68% (SSD) to 1.41% (ours). We further show the generalisation capabilities of our method on additional poison types with a Vision Transformer and a significantly larger dataset using ILSVRC ImageNet. |
| Researcher Affiliation | Academia | Stefan Schoepf EMAIL University of Cambridge, UK & The Alan Turing Institute, UK Jack Foster EMAIL University of Cambridge, UK & The Alan Turing Institute, UK Alexandra Brintrup EMAIL University of Cambridge, UK & The Alan Turing Institute, UK |
| Pseudocode | Yes | Algorithm 1 PTN search with FIM parameter importance estimation |
| Open Source Code | No | The paper does not explicitly state that source code is provided for the methodology described, nor does it provide a direct link to a code repository. The OpenReview link is for the review process, not code. |
| Open Datasets | Yes | We benchmark our contributions using ResNet-9 on CIFAR10 and WideResNet-28x10 on CIFAR100 with 0.2%, 1%, and 2% of the data poisoned and discovery shares ranging from a single sample to 100%. All experiments are performed on Imagenette (Howard, 2019) as an additional smaller dataset, as well as the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) dataset (Russakovsky et al., 2015) as a large dataset going far beyond the complexity of CIFAR100. |
| Dataset Splits | Yes | Goel et al. (2024) use three S_m sizes for each dataset, set as \|S_m\| = [100, 500, 1000], or respectively [0.2%, 1%, 2%] of the whole data. We use the same BadNet poisoning attack of Gu et al. (2019) as an adversarial attack, inserting a trigger pattern of 0.3% white pixels that redirects to class zero, as in Goel et al. (2024). We train ResNet-9 for 4000 epochs and WideResNet-28x10 for 6000 epochs as set by Goel et al. (2024). EU uses the same hyperparameter settings and epochs as used for the original training. Goel et al. (2024) not only tune α but also λ of SSD with a relative relationship to further improve results. They use α = [0.1, 1, 10, 50, 100, 500, 1000, 1e4, 1e5, 1e6] and λ = [0.1α, 0.5α, α, 5α, 10α] and pick the best result for each data point based on an equally weighted average of change in poison unlearned and validation accuracy. All poison attacks are shown on an example image in Fig. 12. The number of poisoned samples and scenarios per dataset + poison attack combination replicates Goel et al. (2024), using S_m = [100, 500, 1000] with S_f = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0] · S_m. |
| Hardware Specification | Yes | All models are trained on an NVIDIA RTX4090 with Intel Xeon processors. |
| Software Dependencies | No | The paper mentions the use of AdamW (Loshchilov and Hutter, 2017) as an optimizer, but does not provide specific version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | We train ResNet-9 for 4000 epochs and WideResNet-28x10 for 6000 epochs as set by Goel et al. (2024). EU uses the same hyperparameter settings and epochs as used for the original training. Goel et al. (2024) not only tune α but also λ of SSD with a relative relationship to further improve results. They use α = [0.1, 1, 10, 50, 100, 500, 1000, 1e4, 1e5, 1e6] and λ = [0.1α, 0.5α, α, 5α, 10α] and pick the best result for each data point based on an equally weighted average of change in poison unlearned and validation accuracy. All models are trained on an NVIDIA RTX4090 with Intel Xeon processors. For the PTN parameters we set ρ = 20%, b_start = 25, and s_step = 1.1. ρ is motivated by the 10 classes in CIFAR10, where we could naively expect that a tenth might redirect to the real label. To avoid an unfair advantage in benchmarking, we choose a conservative value of ρ = 20%, as might be done in practice. The following sensitivity analysis shows that this is not the ideal ρ value, but XLF outperforms the previous SOTA across a wide range of ρ values. b_start = 25 is set to ensure we start outside the critical area shown in Fig. 6 and can be chosen lower in practice for added computational efficiency. As described, however, the compute-expensive part of PTN lies in the importance calculation, with the search aspect running at approximately inference speed on the small set S_f. s_step = 1.1 is set to 10% increments to avoid overshooting and can be set more aggressively in practice. We keep the training parameters the same as Goel et al. (2024), with the only notable changes being a lower learning rate of 0.00025 (versus 0.025) and the use of AdamW (Loshchilov and Hutter, 2017) instead of SGD. |
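The SSD hyperparameter tuning quoted in the Dataset Splits and Experiment Setup rows is a plain grid search: α values crossed with λ values defined relative to α, scored by an equally weighted average of change in poison unlearned and validation accuracy. A minimal sketch, where `evaluate` is a hypothetical stand-in for actually running SSD unlearning and measuring both quantities:

```python
from itertools import product

# Grid reported in the excerpts above.
ALPHAS = [0.1, 1, 10, 50, 100, 500, 1000, 1e4, 1e5, 1e6]
LAMBDA_FACTORS = [0.1, 0.5, 1, 5, 10]  # lambda = factor * alpha

def select_best(evaluate):
    """Return the (alpha, lambda) pair maximising the equally weighted
    average of poison-unlearned change and validation accuracy.

    `evaluate(alpha, lam)` must return (poison_unlearned_change,
    validation_accuracy), both as fractions in [0, 1]; it stands in for
    the expensive unlearning run performed in the benchmark.
    """
    best_pair, best_score = None, float("-inf")
    for alpha, factor in product(ALPHAS, LAMBDA_FACTORS):
        lam = factor * alpha
        poison_change, val_acc = evaluate(alpha, lam)
        score = 0.5 * poison_change + 0.5 * val_acc  # equal weighting
        if score > best_score:
            best_pair, best_score = (alpha, lam), score
    return best_pair, best_score
```

Selecting the best pair per data point, as the paper's baseline does, amounts to calling `select_best` once for each dataset/poison/discovery-share combination.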
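The evaluation scenarios described in the Dataset Splits row form a simple grid: each manipulation-set size S_m is crossed with ten discovery shares S_f. The following sketch enumerates that grid (the function and variable names are ours, not the paper's):

```python
# Values quoted from the excerpts above.
SM_SIZES = [100, 500, 1000]                       # |S_m|: manipulated samples
SF_FRACTIONS = [round(0.1 * i, 1) for i in range(1, 11)]  # 0.1 .. 1.0

def scenario_grid():
    """Yield (sm, discovered) pairs: the total number of manipulated
    samples and how many of them are assumed discovered under each
    S_f share, one pair per benchmark scenario."""
    for sm in SM_SIZES:
        for frac in SF_FRACTIONS:
            yield sm, int(round(frac * sm))
```

This yields 30 scenarios per dataset + poison-attack combination, matching the Sm=[100, 500, 1000] × Sf=[0.1, ..., 1.0] setup replicated from Goel et al. (2024).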
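The PTN parameters ρ, b_start, and s_step describe a multiplicative search schedule: start at b_start and step in 10% increments until a threshold condition on ρ is met. The sketch below illustrates only that step schedule, not the paper's Algorithm 1 (which couples the search with FIM-based parameter importance); the `objective` callback and the search direction are our assumptions.

```python
def multiplicative_search(objective, b_start=25.0, s_step=1.1, rho=0.20,
                          max_iters=500):
    """Grow b from b_start by factor s_step (10% increments) until
    objective(b) drops below the threshold rho; return the accepted b,
    or None if the iteration budget is exhausted. Purely illustrative
    of the schedule implied by the quoted parameters."""
    b = b_start
    for _ in range(max_iters):
        if objective(b) < rho:
            return b
        b *= s_step
    return None
```

Because each step only requires evaluating `objective` (in PTN's case, roughly inference-speed checks on the small set S_f), the schedule itself is cheap relative to the one-off importance calculation.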