Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting

Authors: Yuchen Liu, Chen Chen, Lingjuan Lyu, Fangzhao Wu, Sai Wu, Gang Chen

ICML 2023

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. Experiments on various real-world datasets verify the efficacy of the proposed GAS.
Researcher Affiliation: Collaboration. Yuchen Liu (1*), Chen Chen (2*), Lingjuan Lyu (2), Fangzhao Wu (3), Sai Wu (1), Gang Chen (1). Affiliations: 1 Key Lab of Intelligent Computing Based Big Data of Zhejiang Province, Zhejiang University, Hangzhou, China; 2 Sony AI; 3 Microsoft.
Pseudocode: No. The paper describes the proposed GAS approach in three steps (Splitting, Identification, Aggregation) in paragraph form, but it does not include a formal pseudocode block or algorithm box; a hedged sketch of the three steps is given below.
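Since the paper presents the three steps only in prose, here is a minimal sketch of how they could compose, assuming flattened per-client gradients as PyTorch tensors. The random coordinate partition, the coordinate-wise-median stand-in for the base robust AGR, and all names (gas_aggregate, base_agr) are assumptions for illustration, not the authors' reference implementation; see their repository for that.

```python
import torch

def gas_aggregate(grads, f, p, base_agr=None):
    """Sketch of GAS: Splitting, Identification, Aggregation.

    grads:    (n, d) tensor, one flattened gradient per client
    f:        assumed number of Byzantine clients
    p:        number of sub-vector groups
    base_agr: robust AGR applied per group; a coordinate-wise
              median stand-in is used when none is supplied
    """
    if base_agr is None:
        base_agr = lambda g: g.median(dim=0).values
    n, d = grads.shape
    # Splitting: partition the d coordinates into p groups
    # (a random partition is assumed here).
    groups = torch.randperm(d).chunk(p)
    # Identification: score each client by its distance to the
    # robust aggregate of every sub-vector group.
    scores = torch.zeros(n)
    for idx in groups:
        sub = grads[:, idx]        # sub-vectors of all clients for this group
        agg = base_agr(sub)        # robust aggregate of the group
        scores += (sub - agg).norm(dim=1)
    # Aggregation: average the full gradients of the n - f clients
    # with the lowest identification scores.
    keep = scores.topk(n - f, largest=False).indices
    return grads[keep].mean(dim=0)
```

Under these assumptions, a call such as gas_aggregate(torch.randn(20, 1024), f=4, p=8) returns a single robust gradient.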
Open Source Code: Yes. The implementation is provided at https://github.com/YuchenLiu-a/byzantine-gas.
Open Datasets: Yes. The experiments are conducted on four real-world datasets: CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), a subset of ImageNet (Russakovsky et al., 2015) referred to as ImageNet-12 (Li et al., 2021b), and FEMNIST (Caldas et al., 2018).
Dataset Splits: Yes. For each client, 90% of the data is randomly sampled as training data and the remaining 10% is held out as test data, following Caldas et al. (2018); a small illustrative sketch follows below.
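For illustration only, the quoted 90/10 per-client split could look like the following sketch; the list-of-samples representation and the function name are hypothetical, and the actual LEAF tooling of Caldas et al. (2018) works differently.

```python
import random

def split_client_data(samples, train_frac=0.9, seed=0):
    """Hold out 10% of one client's samples as test data (90/10 split)."""
    rng = random.Random(seed)
    shuffled = list(samples)       # copy so the input stays untouched
    rng.shuffle(shuffled)
    cut = int(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]   # (train, test)
```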
Hardware Specification: No. The paper mentions model architectures such as AlexNet, SqueezeNet, ResNet-18, and a CNN, but it does not specify the hardware (e.g., GPU models, CPU types, or cloud instances) used to run the experiments.
Software Dependencies: No. The paper mentions using an SGD optimizer and refers to various existing robust AGRs, but it does not give version numbers for any software libraries or programming languages used in the implementation (e.g., Python, PyTorch, TensorFlow).
Experiment Setup: Yes. For local training, the number of local epochs is set to 1, the batch size to 64, and the optimizer to SGD with learning rate 0.1, momentum 0.5, and weight decay 0.0001; gradient clipping with a clipping norm of 2 is also adopted. A sketch of this configuration is given below.
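For concreteness, the quoted values map onto a PyTorch configuration roughly as follows; the placeholder model and the loader are assumptions, and only the hyperparameter values come from the paper.

```python
import torch
from torch import nn

model = nn.Linear(3072, 10)   # placeholder; the paper uses AlexNet, ResNet-18, etc.
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.1, momentum=0.5, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()

def run_local_epoch(loader):
    """One local epoch per round, as quoted (loader assumed to yield batches of 64)."""
    for x, y in loader:
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        # Gradient clipping with clipping norm 2, as quoted.
        nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0)
        optimizer.step()
```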