FedSoft: Soft Clustered Federated Learning with Proximal Local Updating
Authors: Yichen Ruan, Carlee Joe-Wong (pp. 8124-8131)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we verify the effectiveness of FedSoft with two base datasets under various mixture patterns. For all experiments, we use N = 100 clients, and the number of samples in each client nk is chosen uniformly at random from 100 to 200. For ease of demonstration, for every base dataset, we first investigate the mixture of S = 2 distributions and then increase S. Table 2 compares FedSoft with the baselines. Not only does FedSoft produce more accurate cluster and local models, but it also achieves a better balance between the two trained centers. (A data-generation sketch based on this setup appears after the table.) |
| Researcher Affiliation | Academia | Yichen Ruan, Carlee Joe-Wong; Carnegie Mellon University; yichenr@andrew.cmu.edu, cjoewong@andrew.cmu.edu |
| Pseudocode | Yes | Algorithm 1: FedSoft (a hedged sketch of its proximal local update appears after the table) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing its source code or a link to a code repository. |
| Open Datasets | Yes | We use three datasets to generate the various distributions: Synthetic, EMNIST and CIFAR-10. |
| Dataset Splits | No | The paper states that it uses 'test accuracy/error on holdout datasets' and 'evaluate their accuracy/error on local training datasets', but it does not provide specific percentages or counts for the train/validation/test splits of the datasets used (Synthetic, EMNIST, CIFAR-10). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | Unless otherwise noted, we choose FedSoft's estimation interval τ = 2, client selection size K = 60, counter smoother σ = 1e-4, and all experiments are run until both cluster and client models have fully converged. All models are randomly initialized with the Xavier normal (Glorot and Bengio 2010) initializer without pre-training, so that the association among clients, centers, and cluster distributions is built automatically during the training process. (A configuration sketch follows the table.) |
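The Research Type row above fixes N = 100 clients, per-client sample counts n_k drawn uniformly from 100 to 200, and data mixed from S base distributions (starting with S = 2). The snippet below is a minimal sketch of how such a client population could be synthesized; the Dirichlet draw for the per-client mixture weights and all variable names are illustrative assumptions, not the paper's actual data generator.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100  # number of clients (paper's setup)
S = 2    # number of base distributions in the mixture (paper starts with S = 2)

# Per-client sample counts n_k, chosen uniformly at random from 100 to 200.
n_k = rng.integers(100, 201, size=N)

# Hypothetical per-client mixture weights over the S base distributions;
# the paper varies these "mixture patterns", a Dirichlet draw is one example.
mixture_weights = rng.dirichlet(np.ones(S), size=N)  # shape (N, S)

# How many samples each client would take from each base distribution.
samples_per_dist = np.floor(mixture_weights * n_k[:, None]).astype(int)
```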
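The Pseudocode row points to Algorithm 1 (FedSoft), whose local step is a proximal update that pulls a client's weights toward the cluster centers, weighted by the client's estimated per-cluster importance. The sketch below is one plausible PyTorch rendering of that step under this assumption; `proximal_local_step`, `lam`, `lr`, and the argument layout are illustrative, not the paper's implementation.

```python
import torch

def proximal_local_step(model, cluster_centers, importance, batch, loss_fn,
                        lr=0.01, lam=0.1):
    """Hedged sketch of a FedSoft-style proximal local update for one batch.

    cluster_centers: list over clusters; each entry is a list of tensors
                     matching model.parameters().
    importance:      per-cluster importance weights u_{k,s} for this client.
    lam, lr:         illustrative values, not taken from the paper.
    """
    x, y = batch
    loss = loss_fn(model(x), y)

    # Proximal term: (lam / 2) * sum_s u_{k,s} * ||w - c_s||^2
    prox = 0.0
    for u_s, center in zip(importance, cluster_centers):
        for w, c in zip(model.parameters(), center):
            prox = prox + u_s * torch.sum((w - c.detach()) ** 2)
    loss = loss + 0.5 * lam * prox

    loss.backward()
    with torch.no_grad():
        for w in model.parameters():
            if w.grad is not None:
                w -= lr * w.grad   # plain SGD step on the proximal objective
                w.grad = None
    return loss.item()
```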
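The Experiment Setup row lists the fixed hyperparameters (estimation interval τ = 2, client selection size K = 60, counter smoother σ = 1e-4, N = 100 clients) and Xavier normal initialization without pre-training. Below is a small configuration sketch; the constant names, the layer types covered, and applying the initializer via `model.apply` are assumptions for illustration.

```python
import torch.nn as nn

# Hyperparameters reported in the paper's experiment setup.
TAU = 2          # estimation interval for the importance weights
K_SELECT = 60    # clients selected per round
SIGMA = 1e-4     # counter smoother
NUM_CLIENTS = 100

def xavier_normal_init(module):
    """Xavier normal (Glorot and Bengio 2010) initialization, no pre-training.
    Restricting it to Linear/Conv2d layers is an assumption for illustration."""
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.xavier_normal_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Usage sketch (build_model is a hypothetical model constructor):
# model = build_model()
# model.apply(xavier_normal_init)
```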