Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Backdoor Mitigation via Invertible Pruning Masks

Authors: Kealan Dunnett, Reza Arablouei, Volkan Dedeoglu, Dimity Miller, Raja Jurdak

NeurIPS 2025 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments demonstrate that our approach outperforms existing pruning-based backdoor mitigation approaches, maintains strong performance under limited data conditions, and achieves competitive results compared to state-of-the-art fine-tuning approaches. Notably, the proposed approach is particularly effective in restoring correct predictions for compromised samples after successful backdoor mitigation. In this section, we evaluate the effectiveness of the proposed IMS approach in mitigating backdoor attacks within the BackdoorBench [7] evaluation tool. Adopting the benchmarking methodology of [8], we conduct a comprehensive evaluation across diverse experimental settings. These include 8 backdoor attacks, 4 model architectures, 3 datasets, and 3 poisoning ratios, a total of 288 distinct test cases. |
| Researcher Affiliation | Academia | Kealan Dunnett1, Reza Arablouei2, Dimity Miller1, Volkan Dedeoglu2, and Raja Jurdak1. 1: Queensland University of Technology, Brisbane, Australia. 2: Data61, CSIRO, Pullenvale QLD, Australia. |
| Pseudocode | Yes | In Algorithm 1, we summarize the training procedure of IMS. |
| Open Source Code | Yes | We provide the implementation of IMS on GitHub (footnote 1: https://github.com/WhoDunnett/BackdoorBenchmark). |
| Open Datasets | Yes | We utilize the CIFAR10, German Traffic Sign Recognition Benchmark (GTSRB), and Tiny-ImageNet datasets, which contain 10, 43, and 200 classes, respectively. To assess the scalability of IMS to real-world datasets with larger image sizes, we evaluate its performance on ImageNette, a subset of ImageNet. |
| Dataset Splits | Yes | Adopting the benchmarking methodology of [8], we conduct a comprehensive evaluation across diverse experimental settings. These include 8 backdoor attacks, 4 model architectures, 3 datasets, and 3 poisoning ratios, a total of 288 distinct test cases. For each dataset, we consider three data settings based on sample per class (SPC) values of 2, 10, and 100. |
| Hardware Specification | Yes | All experiments were run on a 4-core CPU with 32GB of RAM and an H100 GPU. |
| Software Dependencies | No | The paper mentions the use of 'AdamW optimizer' but does not specify any software libraries with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We use the default attack configurations, as specified by BackdoorBench [7], with poisoning ratios of 1%, 5%, and 10%. For each dataset, we consider three data settings based on sample per class (SPC) values of 2, 10, and 100. We set ̑ = 1 in all experiments. We solve the outer subproblem by initially setting ̒ = 0 and then increasing it to ̒ = 10. Values of a and  for ̀ = 20. |
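
The quoted evaluation grid (8 attacks, 4 architectures, 3 datasets, 3 poisoning ratios) and the samples-per-class (SPC) settings can be sketched in a few lines of Python. This is only an illustration of the arithmetic and of what an SPC subset might look like, not the authors' code: the attack and architecture labels are placeholders because the excerpt gives only their counts, and `sample_per_class` is a hypothetical helper that assumes the SPC subset is drawn uniformly at random within each class.

```python
from collections import defaultdict
from itertools import product
import random

# Placeholder labels: the excerpt states only how many attacks and
# architectures are used, not which ones.
ATTACKS = [f"attack_{i}" for i in range(1, 9)]       # 8 backdoor attacks
ARCHITECTURES = [f"arch_{i}" for i in range(1, 5)]   # 4 model architectures
DATASETS = ["CIFAR10", "GTSRB", "Tiny-ImageNet"]     # named in the excerpt
POISONING_RATIOS = [0.01, 0.05, 0.10]                # 1%, 5%, 10%
SPC_VALUES = [2, 10, 100]                            # samples per class


def test_cases():
    """Enumerate every (attack, architecture, dataset, ratio) combination."""
    return list(product(ATTACKS, ARCHITECTURES, DATASETS, POISONING_RATIOS))


def sample_per_class(labels, spc, seed=0):
    """Hypothetical helper: pick at most `spc` indices for each class label.

    Assumes uniform random sampling within a class; the excerpt does not
    say how the SPC subsets are actually drawn.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        pool = by_class[label]
        chosen.extend(rng.sample(pool, min(spc, len(pool))))
    return sorted(chosen)


if __name__ == "__main__":
    print(len(test_cases()))  # 8 * 4 * 3 * 3
    toy_labels = [i % 10 for i in range(1000)]  # toy 10-class label list
    for spc in SPC_VALUES:
        print(spc, len(sample_per_class(toy_labels, spc)))
```

The product of the four list lengths reproduces the 288 test cases quoted above. Whether each SPC setting is evaluated on all 288 cases is not stated in the excerpt, so the sketch keeps the two parts separate.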