Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
DFML: Decentralized Federated Mutual Learning
Authors: Yasser H. Khalil, Amir Hossein Estiri, Mahdi Beitollahi, Nader Asadi, Sobhan Hemati, Xu Li, Guojun Zhang, Xi Chen
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate consistent effectiveness of DFML in both convergence speed and global accuracy, outperforming prevalent baselines under various conditions. For example, with the CIFAR-100 dataset and 50 clients, DFML achieves a substantial increase of +17.20% and +19.95% in global accuracy under Independent and Identically Distributed (IID) and non-IID data shifts, respectively. |
| Researcher Affiliation | Industry | Yasser H. Khalil EMAIL Huawei Noah's Ark Lab, Montreal, Canada. Amir H. Estiri EMAIL Huawei Noah's Ark Lab, Montreal, Canada. Mahdi Beitollahi EMAIL Huawei Noah's Ark Lab, Montreal, Canada. Nader Asadi EMAIL Huawei Noah's Ark Lab, Montreal, Canada. Sobhan Hemati EMAIL Huawei Noah's Ark Lab, Montreal, Canada. Xu Li EMAIL Huawei Technologies Canada Inc., Ottawa, Canada. Guojun Zhang EMAIL Huawei Noah's Ark Lab, Montreal, Canada. Xi Chen EMAIL Huawei Noah's Ark Lab, Montreal, Canada. |
| Pseudocode | Yes | Algorithm 1 DFML |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a link to a code repository. |
| Open Datasets | Yes | We evaluate our proposed DFML against prevalent baselines using five datasets including CIFAR-10/100, FMNIST, Caltech101, Oxford Pets, and Stanford Cars. |
| Dataset Splits | Yes | Each split is further segmented into training and validation sets following an 80:20 ratio. For Caltech101, samples are first split 80:20, where the 20% represents the global test set, and the remaining samples follow the defined splitting strategy above. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like "SGD optimizer" and "cosine annealing" but does not provide specific version numbers for any key software libraries or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We utilize the SGD optimizer for each client with momentum 0.9 and weight decay 5e-4. The learning rate is selected from {0.1, 0.01, 0.001}. The batch size is set to 8 for the EfficientNet experiments, 16 for the ResNet experiments using the Caltech101, Oxford Pets, and Stanford Cars datasets, and 64 for all other experiments. For the cyclic α scheduler, we apply cosine annealing. The initial oscillating period is set to 10 and is incrementally increased after each completion. α is oscillated from 0 to a maximum value selected from {0.8, 0.9, 1.0}. Figure 11 illustrates an example of the behavior of α throughout training. The number of mutual learning epochs K, performed at the aggregator, is set to 10. Moreover, the temperature is configured to 1. All experiments are repeated for 3 trials with random seeds. |
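The cyclic α scheduler described in the setup row can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper's extracted text says the oscillating period starts at 10 and is "incrementally increased after each completion," but does not specify the increment rule, so the period-doubling below (and the exact half-cosine ramp shape) are assumptions.

```python
import math

def cyclic_alpha(round_idx, alpha_max=0.9, initial_period=10):
    """Cosine-annealed cyclic alpha as described in the experiment setup.

    Assumptions (not stated in the paper's extracted text):
    - the period doubles after each completed cycle;
    - within a cycle, alpha ramps from 0 up to alpha_max along a
      half-cosine curve, then restarts at 0 for the next cycle.
    """
    period = initial_period
    t = round_idx
    # Locate the current cycle and the offset within it.
    while t >= period:
        t -= period
        period *= 2  # hypothetical "incrementally increased" rule
    # Half-cosine ramp: 0 at t = 0, alpha_max at t = period - 1.
    return alpha_max * 0.5 * (1 - math.cos(math.pi * t / (period - 1)))
```

With `alpha_max=0.9` and `initial_period=10`, the schedule starts each cycle at 0, reaches 0.9 at the end of the cycle, then begins a longer cycle, roughly matching the oscillating behavior the setup attributes to Figure 11.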