Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

SPMC: Self-Purifying Federated Backdoor Defense via Margin Contribution

Authors: Wenwen He, Wenke Huang, Bin Yang, Shukan Liu, Mang Ye

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results on a variety of classification benchmarks demonstrate that SPMC achieves strong defense performance against sophisticated backdoor attacks without sacrificing accuracy on benign tasks.
Researcher Affiliation Academia ¹National Engineering Research Center for Multimedia Software, School of Computer Science, Wuhan University, Wuhan, China; ²School of National Cyber Security, Wuhan University, Wuhan, China; ³School of Electronic Engineering, Naval University of Engineering, Wuhan, China. Correspondence to: Bin Yang <EMAIL>, Mang Ye <EMAIL>.
Pseudocode Yes B. Algorithm: We provide the algorithm description in Algorithm 1 ("Algorithm 1: SPMC").
Open Source Code Yes The code is posted at: https://github.com/WenddHe0119/SPMC.
Open Datasets Yes MNIST (LeCun et al., 1998) is a handwritten digit dataset of 10 digit classes (0–9) with 70,000 images. CIFAR-10 (Krizhevsky & Hinton, 2009) has 10 semantic classes with 50k training and 10k validation images. CIFAR-100 is a collection of 60,000 32×32 color images in 100 classes, with 600 images per class, commonly used for image classification tasks in machine learning and computer vision. Fashion-MNIST (Xiao et al., 2017) is a dataset of 70,000 grayscale images of 10 fashion categories.
Dataset Splits Yes CIFAR-10 (Krizhevsky & Hinton, 2009) has 10 semantic classes with 50k training and 10k validation images. ... As for the data heterogeneity simulation, we utilize the Dirichlet distribution to simulate the label skew, following previous methods (Li et al., 2020a; 2021; Zhang et al., 2022a; Huang et al., 2023c), where β > 0 is the concentration parameter adjusting the class-wise skew level. We set β = 0.5 to simulate the heterogeneous federated network for the following experiments.
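The Dirichlet label-skew setup above can be sketched in a few lines: for each class, a Dirichlet(β) draw decides what fraction of that class's samples each client receives, with smaller β producing stronger skew. This is a minimal stdlib-only sketch of the common partitioning recipe, not the authors' code; the function name and signature are illustrative.

```python
import random

def dirichlet_label_skew(labels, num_clients, beta=0.5, seed=0):
    """Partition sample indices across clients with Dirichlet(beta) label skew.

    Illustrative sketch: Dirichlet proportions are sampled per class via
    normalized Gamma(beta, 1) draws, then each class's indices are split
    according to those proportions.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    # group sample indices by class label
    by_class = {c: [i for i, y in enumerate(labels) if y == c] for c in classes}
    client_indices = [[] for _ in range(num_clients)]
    for c in classes:
        idx = by_class[c]
        rng.shuffle(idx)
        # Dirichlet(beta) sample = normalized independent Gamma(beta, 1) draws
        gammas = [rng.gammavariate(beta, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        props = [g / total for g in gammas]
        # turn proportions into cumulative cut points over this class's indices
        cuts, acc = [], 0.0
        for p in props[:-1]:
            acc += p
            cuts.append(int(acc * len(idx)))
        parts = [idx[a:b] for a, b in zip([0] + cuts, cuts + [len(idx)])]
        for k in range(num_clients):
            client_indices[k].extend(parts[k])
    return client_indices
```

With β = 0.5, as in the paper, each client ends up with a noticeably uneven class mix, while every sample is still assigned to exactly one client.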
Hardware Specification No The supercomputing system at the Supercomputing Center of Wuhan University supported the numerical calculations in this paper.
Software Dependencies No The paper mentions using SGD optimizer with specific learning rate, weight decay, momentum, and batch size, but does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup Yes For model efficiency and algorithmic convergence considerations, we conduct E = 50 communication epochs for the three datasets. We set the local updating round T = 10, beyond which all federated learning approaches show little or no accuracy gain with more communication. We use the SGD optimizer with the learning rate η = 0.01 for all approaches. The corresponding weight decay is 1e-5 and momentum is 0.9. The training batch size is 64.
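For concreteness, the local update rule implied by these hyperparameters (SGD with momentum 0.9, weight decay 1e-5, η = 0.01) can be written out explicitly. This is a minimal sketch of the standard PyTorch-style update convention, operating on flat lists of floats; it is not the authors' implementation.

```python
def sgd_step(w, grad, velocity, lr=0.01, momentum=0.9, weight_decay=1e-5):
    """One SGD-with-momentum step (PyTorch-style convention):
        g <- grad + weight_decay * w
        v <- momentum * v + g
        w <- w - lr * v
    Returns the updated parameters and velocity buffer.
    """
    new_w, new_v = [], []
    for wi, gi, vi in zip(w, grad, velocity):
        g = gi + weight_decay * wi   # L2 weight decay folded into the gradient
        v = momentum * vi + g        # momentum buffer update
        new_w.append(wi - lr * v)    # parameter step with learning rate eta
        new_v.append(v)
    return new_w, new_v
```

In a federated run, each client would apply T = 10 such local updates per communication round before sending its model back to the server.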