Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Achieving Lossless Gradient Sparsification via Mapping to Alternative Space in Federated Learning
Authors: Do-Yeon Kim, Dong-Jun Han, Jun Seo, Jaekyun Moon
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our proposed mapping approach in terms of communication efficiency in FL setup. ... We essentially follow the settings outlined in previous FL works (Gao et al., 2022; Acar et al., 2021). We investigate a non-IID data setup, and we configure the data split to follow the Dirichlet distribution with parameter values α = 0.6 and α = 0.3. |
| Researcher Affiliation | Collaboration | Do-Yeon Kim 1 Dong-Jun Han 2 Jun Seo 3 Jaekyun Moon 1 ... 1Korea Advanced Institute of Science and Technology (KAIST) 2Purdue University 3LG AI Research. |
| Pseudocode | Yes | The pseudo code of our approach is provided in Algorithm 1. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the described methodology or links to a code repository. |
| Open Datasets | Yes | We evaluate our method in comparison with other baselines on four benchmark datasets: SVHN, CIFAR10, CIFAR100, and Tiny Image Net. |
| Dataset Splits | No | The paper mentions: 'We investigate a non-IID data setup, and we configure the data split to follow the Dirichlet distribution with parameter values α = 0.6 and α = 0.3. Data heterogeneity gets more pronounced as α decreases. The dataset size is balanced across all clients.' This describes how data is distributed among clients but does not specify standard training/validation/test splits (e.g., percentages or sample counts) needed to reproduce model training. |
| Hardware Specification | No | The paper mentions 'on the same GPU device' in Appendix A.11 and provides 'Elapsed time (sec) for mapping construction on the server-side and local training per client per round on the client-side' in Table 3. However, it does not specify any exact GPU/CPU models, processor types, or memory amounts for the hardware used in the experiments. |
| Software Dependencies | No | The paper mentions software components such as the 'SGD optimizer' and 'Pytorch' but does not provide version numbers, which are needed to reproduce the software environment. For example, Appendix A.7 states: 'provided by well-known machine learning framework such as Pytorch.' |
| Experiment Setup | Yes | We simulate FL training where the number of clients is set to be N = 100, with the contact ratio being 0.1, that is, 10 clients are randomly sampled for communication at each FL round. ... The local epoch is set to be 5 with batch size of 50, and we run FL training for 600 rounds in total. ... Throughout all the simulations, we use SGD optimizer with learning rate 0.1, which is decayed with a rate of 0.998 at each round. The weight decay is set to be 5e-4. |
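The non-IID split quoted above (Dirichlet distribution with α = 0.6 or α = 0.3) is a standard construction in FL work: for each class, client proportions are drawn from a Dirichlet prior and the class's samples are allotted accordingly. A minimal NumPy sketch of this idea follows; the function name and interface are illustrative, not from the paper, and this simple version does not enforce the balanced per-client dataset sizes the paper describes.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Partition sample indices across clients via a Dirichlet prior.

    For each class, draw client proportions from Dirichlet(alpha) and
    split that class's samples at the corresponding cut points.
    Smaller alpha yields a more heterogeneous (non-IID) split.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = rng.permutation(np.where(labels == c)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # cumulative cut points that divide this class among clients
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices
```

For example, `dirichlet_partition(labels, num_clients=100, alpha=0.3)` would give each client a class mixture skewed toward a few classes, while α = 0.6 produces milder skew.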
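The experiment setup quoted above fixes all the round-level hyperparameters: N = 100 clients, contact ratio 0.1 (10 clients sampled per round), 600 rounds, and SGD with learning rate 0.1 decayed by 0.998 each round. A hedged sketch of that outer FL loop, in plain Python, is shown below; the helper names are hypothetical and the local-training step is only indicated in comments, since the paper's mapping-based sparsification itself is not reproduced here.

```python
import random

NUM_CLIENTS = 100    # N = 100 clients in total
CONTACT_RATIO = 0.1  # 10 clients sampled per round
ROUNDS = 600         # total FL communication rounds
BASE_LR = 0.1        # initial SGD learning rate
DECAY = 0.998        # multiplicative decay applied each round

def lr_at_round(t):
    # learning rate after t rounds of multiplicative decay
    return BASE_LR * DECAY ** t

def sample_clients(rng):
    # uniformly sample the active client subset for one round
    k = int(NUM_CLIENTS * CONTACT_RATIO)
    return rng.sample(range(NUM_CLIENTS), k)

rng = random.Random(0)
for t in range(ROUNDS):
    active = sample_clients(rng)  # 10 client ids
    lr = lr_at_round(t)
    # Each active client would run 5 local epochs with batch size 50
    # using SGD(lr=lr, weight_decay=5e-4), then communicate its
    # (sparsified) update back to the server for aggregation.
```

Over the full 600 rounds this schedule decays the learning rate to roughly 0.1 × 0.998^600 ≈ 0.03.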