DENSE: Data-Free One-Shot Federated Learning

Authors: Jie Zhang, Chen Chen, Bo Li, Lingjuan Lyu, Shuang Wu, Shouhong Ding, Chunhua Shen, Chao Wu

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on a variety of real-world datasets demonstrate the superiority of our method. For example, DENSE outperforms the best baseline method Fed-ADI by 5.08% on the CIFAR10 dataset.
Researcher Affiliation | Collaboration | Zhejiang University; Youtu Lab, Tencent; Sony AI
Pseudocode | Yes | Algorithm 1: Training process of DENSE
Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets | Yes | Our experiments are conducted on the following 6 real-world datasets: MNIST [24], FMNIST [53], SVHN [43], CIFAR10 [21], CIFAR100 [21], and Tiny-ImageNet [23]. (A loading sketch follows this table.)
Dataset Splits | Yes | Tiny-ImageNet contains 100,000 images of 200 classes (500 per class) downsized to 64x64 color images. Each class has 500 training images, 50 validation images, and 50 test images.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU specifications, or memory amounts. It only mentions general settings like 'on the server' or 'train the auxiliary generator G(·)' without hardware specifics.
Software Dependencies | No | The paper mentions software components like 'SGD optimizer' and 'Adam optimizer' but does not specify version numbers for programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | For clients' local training, we use the SGD optimizer with momentum = 0.9 and learning rate = 0.01. We set the batch size b = 128, the number of local epochs E = 200, and the client number m = 5. Following the setting of [2], we train the auxiliary generator G(·) with a deep convolutional network. We use the Adam optimizer with learning rate ηG = 0.001. We set the number of training rounds in each epoch as TG = 30, and set the scaling factors λ1 = 1 and λ2 = 0.5. For training the server model fS(·), we use the SGD optimizer with learning rate ηS = 0.01 and momentum = 0.9. The number of epochs for distillation is T = 200. All baseline methods use the same setting as ours. (See the configuration sketch after this table.)
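The six benchmark datasets listed in the Open Datasets row are all standard. Below is a minimal loading sketch, assuming torchvision is used (the paper does not name its framework); Tiny-ImageNet is not bundled with torchvision, so the ImageFolder path used for it here is a hypothetical local directory.

```python
# Minimal sketch of loading the benchmark datasets named in the paper,
# assuming torchvision (an assumption; the paper does not state the framework).
import torchvision.datasets as D
import torchvision.transforms as T

root = "./data"
to_tensor = T.ToTensor()

mnist    = D.MNIST(root, train=True, download=True, transform=to_tensor)
fmnist   = D.FashionMNIST(root, train=True, download=True, transform=to_tensor)
svhn     = D.SVHN(root, split="train", download=True, transform=to_tensor)
cifar10  = D.CIFAR10(root, train=True, download=True, transform=to_tensor)
cifar100 = D.CIFAR100(root, train=True, download=True, transform=to_tensor)

# Tiny-ImageNet is not shipped with torchvision; after downloading it
# separately, it can be read with ImageFolder (hypothetical local path).
tiny_imagenet = D.ImageFolder("./data/tiny-imagenet-200/train", transform=to_tensor)
```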
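The hyperparameters reported in the Experiment Setup row can be gathered into a single configuration. The sketch below mirrors those values in PyTorch; the resnet18 client/server models and the one-layer generator are placeholder architectures for illustration only, not the networks used in the paper.

```python
# Hedged sketch of the reported training hyperparameters in PyTorch.
# Model architectures below are placeholders, not the paper's networks.
import torch
import torch.nn as nn
from torchvision.models import resnet18

config = {
    "num_clients": 5,          # m
    "batch_size": 128,         # b
    "local_epochs": 200,       # E
    "client_lr": 0.01,
    "client_momentum": 0.9,
    "generator_lr": 0.001,     # eta_G
    "generator_rounds": 30,    # T_G per epoch
    "lambda1": 1.0,            # scaling factor lambda_1
    "lambda2": 0.5,            # scaling factor lambda_2
    "server_lr": 0.01,         # eta_S
    "server_momentum": 0.9,
    "distill_epochs": 200,     # T
}

# Client-side local training: SGD with momentum, one optimizer per client model.
client_model = resnet18(num_classes=10)                 # placeholder architecture
client_opt = torch.optim.SGD(client_model.parameters(),
                             lr=config["client_lr"],
                             momentum=config["client_momentum"])

# Auxiliary generator: Adam optimizer (the module here is a hypothetical stand-in).
generator = nn.Sequential(nn.Linear(100, 3 * 32 * 32))  # stand-in, not the paper's generator
gen_opt = torch.optim.Adam(generator.parameters(), lr=config["generator_lr"])

# Server-side distillation: SGD with momentum on the server model.
server_model = resnet18(num_classes=10)                 # placeholder architecture
server_opt = torch.optim.SGD(server_model.parameters(),
                             lr=config["server_lr"],
                             momentum=config["server_momentum"])
```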