DoCoFL: Downlink Compression for Cross-Device Federated Learning
Authors: Ron Dorfman, Shay Vargaftik, Yaniv Ben-Itzhak, Kfir Yehuda Levy
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive evaluation, we show that DoCoFL offers significant bi-directional bandwidth reduction while achieving competitive accuracy to that of a baseline without any compression. We cover a wide range of use cases that include two image classification and two language processing tasks with different configurations and data partitioning, as shortly summarized in Table 2 and further detailed in Appendix F. |
| Researcher Affiliation | Collaboration | (1) VMware Research; (2) Viterbi Faculty of Electrical and Computer Engineering, Technion, Haifa, Israel. Correspondence to: Ron Dorfman <rdorfman@campus.technion.ac.il>. |
| Pseudocode | Yes | Algorithm 1 DoCoFL Parameter Server, Algorithm 2 DoCoFL Client i, Algorithm 3 Meta-Algorithm (generalization of DoCoFL), Algorithm 4 Entropy-Constrained Uniform Quantization (ECUQ). (An illustrative quantization sketch follows this table.) |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide a link to a code repository. |
| Open Datasets | Yes | We use the CIFAR-100 and EMNIST datasets. For CIFAR-100 (Krizhevsky et al., 2009), the data distribution among the clients is i.i.d. For EMNIST (Cohen et al., 2017)... For language processing, we perform a sentiment analysis task on the Amazon Reviews dataset (Zhang et al., 2015) with i.i.d. data partitioning; and a next-character prediction task on the Shakespeare dataset (McMahan et al., 2017)... |
| Dataset Splits | No | The paper mentions using train and validation data (e.g., 'reduced the amount of train and validation data for each speaker', 'best validation accuracy', 'validation accuracy throughout training'), but it does not specify explicit percentages or counts for the splits, nor does it cite a standard split for the overall datasets; it only describes how data is distributed among clients. |
| Hardware Specification | No | The paper mentions 'edge devices' and 'low-resourced clients' in the context of the problem, but it does not provide any specific hardware details such as CPU/GPU models, memory, or cloud instance types used for running its experiments. |
| Software Dependencies | No | The paper states 'We implemented DoCoFL in PyTorch (Paszke et al., 2019)', but does not provide a specific version number for PyTorch or any other software dependency. |
| Experiment Setup | Yes | In all experiments, the PS uses Momentum SGD as optimizer with a momentum of 0.9 and L2 regularization (i.e., weight decay) with parameter 10^-5. The clients, on the other hand, use vanilla SGD for all tasks but Amazon Reviews, for which Adam provided better results. In Table 4 we report the hyperparameters used in our experiments. Table 4 (hyperparameters): EMNIST: batch size 64, client optimizer SGD, client lr 0.05, server lr 1; CIFAR-100: batch size 128, SGD, client lr 0.05, server lr 1; Amazon Reviews: batch size 64, Adam, client lr 0.005, server lr 0.1; Shakespeare: batch size 4, SGD, client lr 0.5, server lr 1. (A configuration sketch follows this table.) |
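
The Experiment Setup row fully specifies the optimizer configuration, so it can be mirrored directly in PyTorch. The snippet below is a minimal sketch of the EMNIST row of Table 4 (server Momentum SGD with momentum 0.9 and weight decay 10^-5, client vanilla SGD with lr 0.05, Adam for Amazon Reviews clients); the linear model is a placeholder assumption, since the paper's per-task architectures are not given in this excerpt.

```python
import torch

# Placeholder model; the per-task architectures are not specified in this excerpt.
server_model = torch.nn.Linear(784, 62)
client_model = torch.nn.Linear(784, 62)

# Server optimizer (all tasks): Momentum SGD, momentum 0.9, weight decay 1e-5, server lr 1.
server_opt = torch.optim.SGD(
    server_model.parameters(), lr=1.0, momentum=0.9, weight_decay=1e-5
)

# Client optimizer for EMNIST (Table 4): vanilla SGD with client lr 0.05, batch size 64.
client_opt = torch.optim.SGD(client_model.parameters(), lr=0.05)

# Amazon Reviews is the exception: clients use Adam with client lr 0.005 (Table 4).
adam_client_opt = torch.optim.Adam(client_model.parameters(), lr=0.005)
```

Swapping in the CIFAR-100, Amazon Reviews, or Shakespeare rows only changes the learning rates and batch size; the server-side configuration stays the same across tasks.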
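
The Pseudocode row names Entropy-Constrained Uniform Quantization (ECUQ, Algorithm 4) among the paper's building blocks. As a rough illustration only, and not the authors' exact Algorithm 4, the sketch below quantizes a vector onto uniformly spaced levels and binary-searches for the largest number of levels whose empirical index entropy stays within a given bit budget; the function names and the 2^16 cap on the number of levels are assumptions made for this sketch.

```python
import numpy as np

def uniform_quantize(x, num_levels):
    """Map each entry of x to the nearest of num_levels evenly spaced values."""
    lo, hi = float(x.min()), float(x.max())
    if hi == lo:
        return np.full_like(x, lo), np.zeros(x.shape, dtype=int)
    step = (hi - lo) / (num_levels - 1)
    idx = np.round((x - lo) / step).astype(int)
    return lo + idx * step, idx

def empirical_entropy_bits(idx):
    """Empirical entropy (bits per coordinate) of the quantization indices."""
    _, counts = np.unique(idx, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def ecuq_like(x, bit_budget):
    """Binary-search for the largest number of uniform levels whose empirical
    index entropy is at most bit_budget bits per coordinate (sketch only)."""
    lo_levels, hi_levels, best = 2, 2 ** 16, None
    while lo_levels <= hi_levels:
        mid = (lo_levels + hi_levels) // 2
        xq, idx = uniform_quantize(x, mid)
        if empirical_entropy_bits(idx) <= bit_budget:
            best, lo_levels = xq, mid + 1   # within budget: try finer quantization
        else:
            hi_levels = mid - 1             # too many bits: coarsen
    # Fall back to the coarsest (2-level) quantizer if even that exceeds the budget.
    return best if best is not None else uniform_quantize(x, 2)[0]
```

For instance, `ecuq_like(np.random.randn(10_000).astype(np.float32), bit_budget=3.0)` returns a quantized vector whose indices can be entropy-coded at roughly 3 bits per coordinate.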