Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Efficient Split-Mix Federated Learning for On-Demand and In-Situ Customization
Authors: Junyuan Hong, Haotao Wang, Zhangyang Wang, Jiayu Zhou
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our method provides better in-situ customization than the existing heterogeneous-architecture FL methods. Codes and pre-trained models are available: https://github.com/illidanlab/SplitMix. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Engineering, Michigan State University 2Department of Electrical and Computer Engineering, University of Texas at Austin |
| Pseudocode | Yes | Algorithm 1 Federated Split-Mix Learning, Algorithm 2 Sample Base Models(P, p, n, W), Algorithm 3 Local Train(Wk, Dk, E, η), Algorithm 4 Local Train(Wk, Dk, E, η) with DBN and adversarial training |
| Open Source Code | Yes | Codes and pre-trained models are available: https://github.com/illidanlab/SplitMix. |
| Open Datasets | Yes | We use CIFAR10 dataset (Krizhevsky, 2009) with preactivated ResNet (Pre ResNet18) (He et al., 2016). The CIFAR10 data are uniformly split into 100 clients and distribute 3 classes per client. For (feature) non-i.i.d. configuration, we use Digits with a CNN defined and Domain Net datasets (Li et al., 2020b) with AlexNet extended with BN layers after each convolutional or linear layer (Li et al., 2020b). The first dataset is a subset (30%) of Digits, a benchmark for domain adaption (Peng et al., 2019b). The second dataset is Domain Net (Peng et al., 2019a) processed by (Li et al., 2020b), which contains 6 distinct domains of large-size 256x256 real-world images. |
| Dataset Splits | Yes | The CIFAR10 data are uniformly split into 100 clients and distribute 3 classes per client. Each domain of Digits (or Domain Net) are split into 10 (or 5) clients, and therefore 50 (or 30) clients in total. We use an n-step projected gradient descent (PGD) attack (Madry et al., 2018) with a constant noise magnitude ϵ. Following (Madry et al., 2018), we set (ϵ, n) = (8/255, 7), and attack inner-loop step size 2/255, for training, validation, and test. |
| Hardware Specification | Yes | We implement all algorithms in PyTorch 1.4.1 run on a single NVIDIA RTX A5000 GPU and a 104-thread CPU. |
| Software Dependencies | Yes | We implement all algorithms in PyTorch 1.4.1 run on a single NVIDIA RTX A5000 GPU and a 104-thread CPU. |
| Experiment Setup | Yes | In general, for local optimization we use stochastic gradient descent (SGD) with 0.9 momentum and 5 · 10^-4 weight decay. CIFAR10: Following Hetero FL (Diao et al., 2021), we train with 5 local epochs and 400 global communication rounds. Globally, we initialize the learning rate as 0.01 and adjust the learning rate at 150, 250 communication rounds with a scale rate of 0.1. Locally, we use a larger batch size of 128, to speed up the training in simulation. Digits: We use a cosine annealing learning rate decaying from 0.1 to 0 across 400 global communication rounds. SGD is executed with one epoch for each local client. Domain Net: We use a constant learning rate 0.01 and run 400 communication rounds in total. Similar to Digits, SGD is executed with one epoch for each local client. |
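
The three learning-rate schedules quoted in the Experiment Setup row can be sketched as plain functions of the communication round. This is a minimal illustration, not the authors' implementation: the function names, the 0-based round index, and the choice to apply each step decay once the round index reaches a milestone are assumptions made here, and the released code at the repository above may index rounds differently.

```python
import math

# Values quoted in the Experiment Setup row.
TOTAL_ROUNDS = 400      # global communication rounds for all three benchmarks
MOMENTUM = 0.9          # SGD momentum for local optimization
WEIGHT_DECAY = 5e-4     # SGD weight decay for local optimization


def cifar10_lr(round_idx, base_lr=0.01, milestones=(150, 250), scale=0.1):
    """Step schedule: multiply base_lr by `scale` once per milestone reached.

    Assumption: round_idx is 0-based and the decay takes effect when
    round_idx >= milestone.
    """
    reached = sum(1 for m in milestones if round_idx >= m)
    return base_lr * scale ** reached


def digits_lr(round_idx, base_lr=0.1, total_rounds=TOTAL_ROUNDS):
    """Cosine annealing from base_lr at round 0 to 0 at round total_rounds."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * round_idx / total_rounds))


def domainnet_lr(round_idx, base_lr=0.01):
    """Constant learning rate for every round."""
    return base_lr


if __name__ == "__main__":
    for r in (0, 149, 150, 250, 399):
        print(r, cifar10_lr(r), digits_lr(r), domainnet_lr(r))
```

In a PyTorch training loop, the same shapes would typically come from the built-in `MultiStepLR` and `CosineAnnealingLR` schedulers stepped once per communication round; the pure-Python form is used here only to keep the sketch dependency-free.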