Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Data-Free Diversity-Based Ensemble Selection for One-Shot Federated Learning
Authors: Naibo Wang, Wenjie Feng, Yuchen Deng, Moming Duan, Fusheng Liu, See-Kiong Ng
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method can achieve both better performance and higher efficiency over 7 datasets, 5 different model structures, and both homogeneous and heterogeneous model groups under four different data-partition strategies. |
| Researcher Affiliation | Academia | 1 Institute of Data Science, National University of Singapore 2 School of Mathematics and Statistics, Changchun University of Technology |
| Pseudocode | Yes | Algorithm 1: DeDES framework; Algorithm 2: Outlier Filter algorithm for the model filtering |
| Open Source Code | No | The paper does not provide concrete access to source code. It does not contain an explicit statement about code release, a link to a code repository, or mention of code in supplementary materials. |
| Open Datasets | Yes | We used 7 image datasets and 5 types of neural network models in our experiments, details can be found in the Appendix. [...] Figure 3 shows an example of the data distribution under the different partition strategies for CIFAR-10 with 5 parties. [...] EMNIST Digits [...] EMNIST Balanced [...] SVHN [...] FEMNIST [...] CIFAR10 [...] CIFAR100 |
| Dataset Splits | Yes | To simulate the real scenarios in FL (Li et al., 2022), we designed four types of dataset-partition strategies to evaluate DeDES, which lead to different local data distributions to train diverse client models Mi. Homogeneous (homo): the amount of samples and the data distribution keep the same for all parties. IID but different quantity (iid-dq): the training data of each party follows the same distribution, but the amount of data is different. Skewed data distribution (noniid-lds): the training data of each party follows different distributions, especially for the label distribution. Non-iid with k (< C) classes (noniid-lk): the training data of each party only contains k of C classes, which is an extreme non-iid setting. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It mentions training models but not on what hardware. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | No | The detailed runtime setups and configuration of DeDES are elaborated in the Appendix, including the learning rate, model representation strategy, clustering method for different data partitions, etc. |
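The four partition strategies quoted in the Dataset Splits row can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the function name `partition_labels`, the Dirichlet concentration 0.5 for the skewed case, and the (possibly overlapping) class-sampling in `noniid-lk` are all assumptions.

```python
import numpy as np

def partition_labels(labels, n_parties, strategy, k=2, seed=0):
    """Sketch of the four data-partition strategies (homo, iid-dq,
    noniid-lds, noniid-lk). Returns a list of index arrays, one per party.
    Hypothetical reconstruction; not the paper's code."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    n_classes = int(labels.max()) + 1
    idx = rng.permutation(len(labels))

    if strategy == "homo":
        # Same quantity and (overall) distribution for every party.
        return np.array_split(idx, n_parties)

    if strategy == "iid-dq":
        # Same distribution, different quantities: random cut points.
        cuts = np.sort(rng.choice(len(labels) - 1, n_parties - 1,
                                  replace=False)) + 1
        return np.split(idx, cuts)

    if strategy == "noniid-lds":
        # Skewed label distribution: per class, split samples across
        # parties with Dirichlet-sampled proportions (alpha assumed 0.5).
        parts = [[] for _ in range(n_parties)]
        for c in range(n_classes):
            c_idx = rng.permutation(np.flatnonzero(labels == c))
            props = rng.dirichlet(np.full(n_parties, 0.5))
            cuts = (np.cumsum(props)[:-1] * len(c_idx)).astype(int)
            for p, chunk in enumerate(np.split(c_idx, cuts)):
                parts[p].extend(chunk)
        return [np.array(p) for p in parts]

    if strategy == "noniid-lk":
        # Extreme non-iid: each party sees only k of the C classes.
        # Simplified sketch; parties choosing the same class share samples.
        parts = []
        for _ in range(n_parties):
            classes = rng.choice(n_classes, k, replace=False)
            parts.append(np.flatnonzero(np.isin(labels, classes)))
        return parts

    raise ValueError(f"unknown strategy: {strategy}")
```

For example, `partition_labels(labels, 5, "noniid-lk", k=2)` gives each of the five parties samples from exactly two classes, while `"homo"` yields five equal-sized IID shards.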