z-SignFedAvg: A Unified Stochastic Sign-Based Compression for Federated Learning
Authors: Zhiwei Tang, Yanmeng Wang, Tsung-Hui Chang
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted to demonstrate that z-SignFedAvg can achieve competitive empirical performance on real datasets and outperforms existing schemes. Through both theoretical analyses and empirical experiments, we have shown that z-SignFedAvg can perform nearly the same as, and sometimes even better than, the uncompressed FedAvg, while enjoying a significant reduction in the number of bits transmitted from clients to the server. |
| Researcher Affiliation | Academia | Zhiwei Tang (1,2), Yanmeng Wang (1,2), Tsung-Hui Chang (1,2); (1) School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China; (2) Shenzhen Research Institute of Big Data, Shenzhen, China |
| Pseudocode | Yes | Algorithm 1: z-SignFedAvg (or z-SignSGD when E = 1). A minimal sketch of the core update rule is given after the table. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing code or a link to a source code repository for the described methodology. |
| Open Datasets | Yes | z-SignSGD on non-i.i.d. MNIST: In this section, we consider an extremely non-i.i.d. setting with the MNIST dataset (Deng 2012). z-SignFedAvg on EMNIST and CIFAR-10: In this section, we evaluate the performance of our proposed z-SignFedAvg on two classical datasets: EMNIST (Cohen et al. 2017) and CIFAR-10 (Krizhevsky and Hinton 2010). |
| Dataset Splits | Yes | Specifically, we split the dataset into 10 parts based on the labels, and each client has the data of one digit only. For the EMNIST dataset, there are 3579 clients in total, and 100 clients are uniformly sampled in each communication round to upload their compressed gradients. For the CIFAR-10 dataset, the training samples are partitioned among 100 clients, and each client has an associated multinomial distribution over labels drawn from a symmetric Dirichlet distribution with parameter 1 (a partition sketch follows the table). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | A simple two-layer convolutional neural network (CNN) from the PyTorch tutorial (Paszke et al. 2017) was used. (No version number is specified for PyTorch.) |
| Experiment Setup | Yes | We fixed the client stepsize to 0.05 for the EMNIST dataset and 0.1 for the CIFAR-10 dataset. For both datasets, we set the local batch size to 32. The same noise scales were used for 1-SignFedAvg and ∞-SignFedAvg: σ = 0.01 for EMNIST and σ = 0.0005 for CIFAR-10. For the number of local steps, we set E = 20 for EMNIST and E = 5 for CIFAR-10. These hyperparameters are collected into a config sketch after the table. |
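
For reference, here is a minimal sketch of the stochastic sign compression at the heart of z-SignFedAvg. The function names, the uniform noise choice, and the server scaling below are our assumptions for illustration; the paper parameterizes a whole family of noise distributions by z, and this sketch is not the authors' implementation.

```python
import numpy as np

def stochastic_sign(update, sigma, rng):
    """Client-side compression: perturb the local model update with
    zero-mean noise, then transmit only the sign (1 bit per coordinate).
    Uniform noise on [-sigma, sigma] is an illustrative choice; the paper
    indexes the noise distribution by the parameter z."""
    noise = rng.uniform(-sigma, sigma, size=update.shape)
    return np.sign(update + noise).astype(np.int8)

def server_step(model, client_signs, server_stepsize):
    """Server-side aggregation: average the 1-bit updates from the
    sampled clients and apply them as a descent direction."""
    direction = np.stack(client_signs).astype(np.float64).mean(axis=0)
    return model + server_stepsize * direction
```

With E local SGD steps per round (E = 1 recovering z-SignSGD), `update` would be the difference between a client's final local model and the round's starting model.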
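The Dirichlet-based CIFAR-10 split described in the table can be reproduced with a standard construction like the one below. The paper does not spell out the exact procedure, so the function name, seed handling, and assignment rule are our assumptions.

```python
import numpy as np

def dirichlet_partition(labels, n_clients=100, alpha=1.0, seed=0):
    """Assign each training sample to a client with probability
    proportional to that client's Dirichlet-drawn weight for the
    sample's label (one common realization of the described split)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n_classes = int(labels.max()) + 1
    # Row k is client k's multinomial distribution over labels ~ Dir(alpha).
    label_dist = rng.dirichlet(alpha * np.ones(n_classes), size=n_clients)
    client_indices = [[] for _ in range(n_clients)]
    for idx, y in enumerate(labels):
        weights = label_dist[:, y]
        client = rng.choice(n_clients, p=weights / weights.sum())
        client_indices[client].append(idx)
    return client_indices
```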
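Finally, the hyperparameters reported in the experiment setup, collected into a single config for convenience (the key names are ours, not the paper's):

```python
# Hyperparameters as reported in the paper; key names are our own.
EXPERIMENT_CONFIG = {
    "EMNIST":   {"client_stepsize": 0.05, "batch_size": 32, "sigma": 0.01,   "local_steps_E": 20},
    "CIFAR-10": {"client_stepsize": 0.1,  "batch_size": 32, "sigma": 0.0005, "local_steps_E": 5},
}
```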