z-SignFedAvg: A Unified Stochastic Sign-Based Compression for Federated Learning

Authors: Zhiwei Tang, Yanmeng Wang, Tsung-Hui Chang

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility assessment: each entry below lists the variable, the assessed result, and the supporting excerpt or LLM response.

Research Type: Experimental
"Extensive experiments are conducted to demonstrate that z-SignFedAvg can achieve competitive empirical performance on real datasets and outperforms existing schemes. Through both theoretical analyses and empirical experiments, we have shown that z-SignFedAvg can perform nearly the same as, and sometimes even better than, the uncompressed FedAvg, while enjoying a significant reduction in the number of bits transmitted from clients to the server."

Researcher Affiliation: Academia
"Zhiwei Tang (1,2), Yanmeng Wang (1,2), Tsung-Hui Chang (1,2). (1) School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China; (2) Shenzhen Research Institute of Big Data, Shenzhen, China."

Pseudocode: Yes
"Algorithm 1: z-SignFedAvg (or z-SignSGD when E = 1)"

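The paper's Algorithm 1 itself is not reproduced in this report, but the stochastic sign compression it builds on is simple to sketch. The minimal PyTorch snippet below reflects our reading of the scheme: each client adds zero-mean symmetric noise of scale sigma to its update and transmits only the signs, which the server averages. The function names, the Gaussian noise choice, and the omitted server rescaling are illustrative assumptions, not the paper's exact pseudocode.

```python
import torch

def stochastic_sign(update: torch.Tensor, sigma: float) -> torch.Tensor:
    """Compress a client update to one bit per coordinate.

    Zero-mean symmetric noise is added before taking the sign, so
    E[sign(u + sigma * z)] is monotonically increasing in u, which is
    what lets the averaged signs retain magnitude information across
    many clients. Gaussian noise is an illustrative choice here; the
    paper's framework allows a general symmetric noise distribution z.
    """
    noise = sigma * torch.randn_like(update)
    return torch.sign(update + noise)

def server_average(sign_messages: list) -> torch.Tensor:
    """Average the +/-1 messages from the sampled clients.

    A server-side stepsize/rescaling (omitted here) would be applied
    before the model update.
    """
    return torch.stack(sign_messages).mean(dim=0)
```
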
Open Source Code: No
The paper does not provide an explicit statement about releasing code or a link to a source-code repository for the described methodology.

Open Datasets: Yes
"z-SignSGD on non-i.i.d. MNIST: In this section, we consider an extremely non-i.i.d. setting with the MNIST dataset (Deng 2012). z-SignFedAvg on EMNIST and CIFAR-10: In this section, we evaluate the performance of our proposed z-SignFedAvg on two classical datasets: EMNIST (Cohen et al. 2017) and CIFAR-10 (Krizhevsky and Hinton 2010)."

Dataset Splits: Yes
"Specifically, we split the dataset into 10 parts based on the labels, and each client has the data of one digit only. For the EMNIST dataset, there are 3,579 clients in total, and 100 clients were uniformly sampled in each communication round to upload their compressed gradients. For the CIFAR-10 dataset, the training samples are partitioned among 100 clients, and each client has an associated multinomial distribution over labels drawn from a symmetric Dirichlet distribution with parameter 1."

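Both partitions described in this excerpt are straightforward to mimic. The NumPy sketch below is one common construction, not the authors' code: the label split gives each of the 10 clients a single digit, and the Dirichlet partition divides each class across clients in proportions drawn from Dir(alpha), which approximates the per-client multinomial description above. Calling dirichlet_partition(train_labels, n_clients=100, alpha=1.0) matches the stated CIFAR-10 setting.

```python
import numpy as np

def split_by_label(labels: np.ndarray) -> dict:
    """Extreme non-i.i.d. split: client k gets every sample of digit k."""
    return {int(k): np.where(labels == k)[0] for k in np.unique(labels)}

def dirichlet_partition(labels: np.ndarray, n_clients: int = 100,
                        alpha: float = 1.0, seed: int = 0) -> list:
    """Split each class across clients in proportions drawn from Dir(alpha)."""
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    # proportions[c, k]: fraction of class c assigned to client k.
    proportions = rng.dirichlet(alpha * np.ones(n_clients), size=len(classes))
    client_indices = [[] for _ in range(n_clients)]
    for c, cls in enumerate(classes):
        idx = rng.permutation(np.where(labels == cls)[0])
        cuts = (np.cumsum(proportions[c])[:-1] * len(idx)).astype(int)
        for k, chunk in enumerate(np.split(idx, cuts)):
            client_indices[k].extend(chunk.tolist())
    return [np.array(ix) for ix in client_indices]
```
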
Hardware Specification: No
The paper does not report the hardware used for its experiments (e.g., exact GPU/CPU models, memory amounts, or other machine specifications).

Software Dependencies: No
"A simple two-layer convolutional neural network (CNN) from the PyTorch tutorial (Paszke et al. 2017) was used." No version number is specified for PyTorch.

Experiment Setup: Yes
"We fixed the client stepsize as 0.05 and 0.1 for the EMNIST and CIFAR-10 datasets, respectively. For both datasets, we set the local batch size as 32. The same noise scales were used for 1-SignFedAvg and ∞-SignFedAvg: σ = 0.01 for EMNIST and σ = 0.0005 for CIFAR-10. For the number of local steps, we set E = 20 for EMNIST and E = 5 for CIFAR-10."
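
For reference, the reported hyperparameters can be gathered into a single configuration sketch. The dict below uses key names of our own choosing; the values are taken verbatim from the excerpts above.

```python
# Reported hyperparameters from the paper's experiment setup.
# Dict structure and key names are ours; values come from the text above.
CONFIGS = {
    "EMNIST": {
        "client_stepsize": 0.05,
        "local_batch_size": 32,
        "noise_scale_sigma": 0.01,
        "local_steps_E": 20,
        "total_clients": 3579,       # from the dataset-splits excerpt
        "clients_per_round": 100,    # uniformly sampled each round
    },
    "CIFAR-10": {
        "client_stepsize": 0.1,
        "local_batch_size": 32,
        "noise_scale_sigma": 0.0005,
        "local_steps_E": 5,
        "total_clients": 100,        # symmetric Dirichlet(1) partition
    },
}
```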