Achieving Lossless Gradient Sparsification via Mapping to Alternative Space in Federated Learning

Authors: Do-Yeon Kim, Dong-Jun Han, Jun Seo, Jaekyun Moon

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we evaluate our proposed mapping approach in terms of communication efficiency in FL setup. ... We essentially follow the settings outlined in previous FL works (Gao et al., 2022; Acar et al., 2021). We investigate a non-IID data setup, and we configure the data split to follow the Dirichlet distribution with parameter values α = 0.6 and α = 0.3."
Researcher Affiliation | Collaboration | "Do-Yeon Kim (1), Dong-Jun Han (2), Jun Seo (3), Jaekyun Moon (1) ... (1) Korea Advanced Institute of Science and Technology (KAIST), (2) Purdue University, (3) LG AI Research."
Pseudocode | Yes | "The pseudo code of our approach is provided in Algorithm 1."
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology, nor a link to a code repository.
Open Datasets | Yes | "We evaluate our method in comparison with other baselines on four benchmark datasets: SVHN, CIFAR10, CIFAR100, and Tiny ImageNet."
Dataset Splits | No | The paper mentions: 'We investigate a non-IID data setup, and we configure the data split to follow the Dirichlet distribution with parameter values α = 0.6 and α = 0.3. Data heterogeneity gets more pronounced as α decreases. The dataset size is balanced across all clients.' This describes how the data is distributed among clients but does not specify standard training/validation/test splits (e.g., percentages or sample counts) needed to reproduce model training. (A sketch of the Dirichlet-based client partition appears below the table.)
Hardware Specification | No | The paper mentions 'on the same GPU device' in Appendix A.11 and provides 'Elapsed time (sec) for mapping construction on the server-side and local training per client per round on the client-side' in Table 3. However, it does not specify exact GPU/CPU models, processor types, or memory amounts for the hardware used in the experiments.
Software Dependencies | No | The paper mentions software components such as the SGD optimizer and PyTorch but does not provide version numbers for them, which are necessary for reproducible software dependencies. For example, Appendix A.7 states: 'provided by well-known machine learning framework such as Pytorch.'
Experiment Setup | Yes | "We simulate FL training where the number of clients is set to be N = 100, with the contact ratio being 0.1, that is, 10 clients are randomly sampled for communication at each FL round. ... The local epoch is set to be 5 with batch size of 50, and we run FL training for 600 rounds in total. ... Throughout all the simulations, we use SGD optimizer with learning rate 0.1, which is decayed with a rate of 0.998 at each round. The weight decay is set to be 5e-4." (A minimal sketch of this training configuration appears below the table.)
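
To make the non-IID split quoted in the Dataset Splits row concrete, here is a minimal sketch of a Dirichlet-based client partition of the kind the paper describes (α = 0.6 or 0.3). The function name `dirichlet_split`, the NumPy implementation, and the seeding are illustrative assumptions, not the authors' code; note also that the paper additionally keeps client dataset sizes balanced, which this sketch does not enforce.

```python
import numpy as np

def dirichlet_split(labels, num_clients=100, alpha=0.6, seed=0):
    """Partition sample indices across clients with per-class Dirichlet(alpha) proportions.

    Smaller alpha -> stronger label heterogeneity (more non-IID).
    Generic sketch only; it does not balance client dataset sizes as the paper does.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]

    for c in range(num_classes):
        idx_c = np.flatnonzero(labels == c)
        rng.shuffle(idx_c)
        # Draw each client's share of class c from Dirichlet(alpha).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn shares into cut points over the shuffled class-c indices.
        cuts = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, shard in enumerate(np.split(idx_c, cuts)):
            client_indices[client_id].extend(shard.tolist())

    return [np.array(sorted(ind)) for ind in client_indices]
```

For example, `dirichlet_split(cifar10_train.targets, num_clients=100, alpha=0.3)` would return one index array per client, with each client's label distribution skewed more strongly than at α = 0.6.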
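The Experiment Setup row pins down the federated training loop: 100 clients, 10% sampled per round, 5 local epochs, batch size 50, 600 rounds, SGD with learning rate 0.1 decayed by 0.998 per round, and weight decay 5e-4. Below is a minimal FedAvg-style sketch wiring those numbers together; it deliberately omits the paper's contribution (the mapping to an alternative space and the gradient sparsification), and the helper name `run_round` and the plain weight-averaging aggregation are assumptions for illustration.

```python
import copy
import random
import torch
import torch.nn.functional as F

# Hyperparameters quoted in the Experiment Setup row.
NUM_CLIENTS   = 100
CONTACT_RATIO = 0.1      # 10 clients sampled per round
LOCAL_EPOCHS  = 5
BATCH_SIZE    = 50       # used when building each client's DataLoader
TOTAL_ROUNDS  = 600
LR, LR_DECAY  = 0.1, 0.998
WEIGHT_DECAY  = 5e-4

def run_round(global_model, client_loaders, rnd):
    """One FL round: sample clients, run local SGD, average the resulting weights.

    Plain FedAvg sketch; the paper's lossless sparsification via mapping is not shown.
    """
    lr = LR * (LR_DECAY ** rnd)                     # per-round learning-rate decay
    sampled = random.sample(range(NUM_CLIENTS), int(CONTACT_RATIO * NUM_CLIENTS))
    client_states = []
    for cid in sampled:
        local = copy.deepcopy(global_model)
        opt = torch.optim.SGD(local.parameters(), lr=lr, weight_decay=WEIGHT_DECAY)
        for _ in range(LOCAL_EPOCHS):
            for x, y in client_loaders[cid]:        # DataLoader(batch_size=BATCH_SIZE)
                opt.zero_grad()
                F.cross_entropy(local(x), y).backward()
                opt.step()
        client_states.append(local.state_dict())
    # Aggregate: element-wise mean of the sampled clients' parameters and buffers.
    ref = client_states[0]
    avg = {k: torch.stack([s[k].float() for s in client_states]).mean(0).to(v.dtype)
           for k, v in ref.items()}
    global_model.load_state_dict(avg)

# Training loop over the reported number of rounds:
# for rnd in range(TOTAL_ROUNDS):
#     run_round(global_model, client_loaders, rnd)
```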