Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Exploit Gradient Skewness to Circumvent Byzantine Defenses for Federated Learning

Authors: Yuchen Liu, Chen Chen, Lingjuan Lyu, Yaochu Jin, Gang Chen

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on three benchmark datasets validate the effectiveness of our attack. For instance, STRIKE attack improves upon the best baseline by 57.84% against DnC on FEMNIST dataset when there are 20% Byzantine clients. Table 1: Accuracy (mean ± std) under different attacks against different defenses on CIFAR-10, ImageNet-12, and FEMNIST.
Researcher Affiliation Collaboration Yuchen Liu (1,2)*, Chen Chen (3)*, Lingjuan Lyu (3), Yaochu Jin (4), Gang Chen (1,2). (1) The State Key Laboratory of Blockchain and Data Security, Zhejiang University; (2) Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; (3) Sony AI; (4) Westlake University, China
Pseudocode Yes The procedure of STRIKE attack is shown in Algorithm 1 in Appendix B.
Open Source Code Yes Code: https://github.com/YuchenLiu-a/byzantine-skew
Open Datasets Yes Our experiments are conducted on three real-world datasets: CIFAR-10 (Krizhevsky and Hinton 2009), a subset of ImageNet (Russakovsky et al. 2015) referred to as ImageNet-12 (Li et al. 2021), and FEMNIST (Caldas et al. 2018).
Dataset Splits Yes To construct our FL setup, we split the CIFAR-10 (Krizhevsky and Hinton 2009) dataset in a non-IID manner among 100 clients. For more setup details, please refer to Appendix A.1. We vary the Dirichlet concentration parameter β in {0.1, 0.2, 0.5, 0.7, 0.9} to study how our attack behaves under different non-IID levels.
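The Dirichlet-based non-IID split described above is a standard construction: for each class, per-client proportions are drawn from Dirichlet(β), so smaller β yields more skewed label distributions. A minimal sketch, assuming this common recipe (the function name and exact details are illustrative; the paper's precise procedure is in its Appendix A.1):

```python
import numpy as np

def dirichlet_partition(labels, n_clients=100, beta=0.5, seed=0):
    """Split sample indices among clients with Dirichlet(beta) label skew.

    For each class, draw per-client proportions from Dirichlet(beta) and
    hand out that class's samples accordingly. Smaller beta -> more non-IID.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n_classes = int(labels.max()) + 1
    client_idx = [[] for _ in range(n_clients)]
    for c in range(n_classes):
        idx = rng.permutation(np.flatnonzero(labels == c))
        # Fraction of class-c samples assigned to each client.
        props = rng.dirichlet([beta] * n_clients)
        cuts = (np.cumsum(props) * len(idx)).astype(int)[:-1]
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return client_idx
```

With β = 0.1 most clients end up holding samples from only a few classes, while β = 0.9 approaches a uniform split.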
Hardware Specification Yes We conduct all experiments on the same workstation with 8 Intel(R) Xeon(R) Platinum 8336C CPUs, an NVIDIA Tesla V100, and 64GB main memory, running Linux.
Software Dependencies No The paper mentions 'Linux platform' as the operating system but does not provide specific software dependencies (libraries, frameworks) with version numbers.
Experiment Setup Yes We run FedAvg (McMahan et al. 2017) for 200 communication rounds. The detailed introduction and hyperparameter settings of these attacks are shown in Appendix D.1. The detailed hyperparameter settings of the above robust AGRs are listed in Appendix D.1.
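For context, the core of a FedAvg communication round is a size-weighted average of client updates; Byzantine-robust AGRs replace exactly this aggregation step. A minimal sketch of the plain (non-robust) aggregation, with hypothetical names, not the paper's implementation:

```python
import numpy as np

def fedavg_round(global_w, client_updates, client_sizes):
    """One FedAvg aggregation step.

    global_w       : current global parameter vector, shape (dim,)
    client_updates : list of per-client update vectors, each shape (dim,)
    client_sizes   : number of local samples per client (aggregation weights)

    Returns the new global parameters: global_w plus the weighted mean
    of the client updates. Robust AGRs (e.g. DnC) would filter or
    re-weight `client_updates` before this averaging.
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()                 # normalize to sum to 1
    stacked = np.stack(client_updates)       # (n_clients, dim)
    return global_w + weights @ stacked      # weighted average update
```

Running this for 200 rounds, with each round sampling clients, collecting their local updates, and applying the aggregation above, matches the setup the excerpt describes.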