DENSE: Data-Free One-Shot Federated Learning
Authors: Jie Zhang, Chen Chen, Bo Li, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chunhua Shen, Chao Wu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a variety of real-world datasets demonstrate the superiority of our method. For example, DENSE outperforms the best baseline method Fed-ADI by 5.08% on the CIFAR10 dataset. |
| Researcher Affiliation | Collaboration | Zhejiang University; Youtu Lab, Tencent; Sony AI |
| Pseudocode | Yes | Algorithm 1 Training process of DENSE |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | Our experiments are conducted on the following 6 real-world datasets: MNIST [24], FMNIST [53], SVHN [43], CIFAR10 [21], CIFAR100 [21], and Tiny-ImageNet [23]. |
| Dataset Splits | Yes | Tiny-ImageNet contains 100,000 images across 200 classes (500 per class), downsized to 64×64 color images. Each class has 500 training images, 50 validation images, and 50 test images. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU specifications, or memory amounts. It only mentions general settings like 'on the server' or 'train the auxiliary generator G(·)' without hardware specifics. |
| Software Dependencies | No | The paper mentions software components like 'SGD optimizer' and 'Adam optimizer' but does not specify any version numbers for programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For clients' local training, we use the SGD optimizer with momentum = 0.9 and learning rate = 0.01. We set the batch size b = 128, the number of local epochs E = 200, and the client number m = 5. Following the setting of [2], we train the auxiliary generator G(·) with a deep convolutional network. We use the Adam optimizer with learning rate ηG = 0.001. We set the number of training rounds in each epoch as TG = 30, and set the scaling factors λ1 = 1 and λ2 = 0.5. For the training of the server model f_S(·), we use the SGD optimizer with learning rate ηS = 0.01 and momentum = 0.9. The number of epochs for distillation is T = 200. All baseline methods use the same setting as ours. |
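For readers who want to mirror the reported experiment setup in code, below is a minimal PyTorch-style sketch of the hyperparameters quoted in the Experiment Setup row. The module names (`client_model`, `generator`, `server_model`) and the `make_optimizers` helper are illustrative assumptions, not taken from the authors' released code; only the numeric settings come from the quoted setup.

```python
# Hypothetical configuration sketch reflecting the hyperparameters reported above.
# Model objects passed in are assumed to be torch.nn.Module instances.
import torch

CONFIG = {
    # Client-side local training
    "local": {"optimizer": "SGD", "lr": 0.01, "momentum": 0.9,
              "batch_size": 128, "epochs": 200, "num_clients": 5},
    # Auxiliary generator G(·) trained on the server
    "generator": {"optimizer": "Adam", "lr": 1e-3,
                  "rounds_per_epoch": 30,        # T_G
                  "lambda1": 1.0, "lambda2": 0.5},
    # Server model f_S(·) trained via data-free distillation
    "server": {"optimizer": "SGD", "lr": 0.01, "momentum": 0.9,
               "distill_epochs": 200},           # T
}

def make_optimizers(client_model, generator, server_model):
    """Build the three optimizers with the settings reported in the paper."""
    opt_local = torch.optim.SGD(client_model.parameters(),
                                lr=CONFIG["local"]["lr"],
                                momentum=CONFIG["local"]["momentum"])
    opt_gen = torch.optim.Adam(generator.parameters(),
                               lr=CONFIG["generator"]["lr"])
    opt_server = torch.optim.SGD(server_model.parameters(),
                                 lr=CONFIG["server"]["lr"],
                                 momentum=CONFIG["server"]["momentum"])
    return opt_local, opt_gen, opt_server
```

This sketch only pins down the optimization settings; the generator architecture, the distillation losses weighted by λ1 and λ2, and the one-shot communication protocol follow Algorithm 1 of the paper and are not reproduced here.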