Preservation of the Global Knowledge by Not-True Distillation in Federated Learning

Authors: Gihun Lee, Minchan Jeong, Yongjin Shin, Sangmin Bae, Se-Young Yun

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "In the experiments, FedNTD shows state-of-the-art performance on various setups without compromising data privacy or incurring additional communication costs. We test our algorithm on MNIST [11], CIFAR-10 [25], CIFAR-100 [25], and CINIC-10 [10]. We compare our FedNTD with various existing works, with results shown in Table 1." |
| Researcher Affiliation | Academia | "Gihun Lee*, Minchan Jeong*, Yongjin Shin, Sangmin Bae, Se-Young Yun (KAIST) {opcrisis, mcjeong, yj.shin, bsmn0223, yunseyoung}@kaist.ac.kr" |
| Pseudocode | Yes | "Algorithm 1: Federated Not-True Distillation (FedNTD)" (a hedged sketch of the not-true distillation loss follows this table) |
| Open Source Code | Yes | https://github.com/Lee-Gihun/FedNTD |
| Open Datasets | Yes | "We test our algorithm on MNIST [11], CIFAR-10 [25], CIFAR-100 [25], and CINIC-10 [10]." |
| Dataset Splits | No | The paper does not explicitly describe a separate validation split with specific percentages or counts; it refers only to training and test sets. |
| Hardware Specification | No | The text does not specify the hardware used for the experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper cites PyTorch in its references but does not list software dependencies with version numbers (e.g., Python or PyTorch versions) used for the experiments. |
| Experiment Setup | Yes | "We use momentum SGD with an initial learning rate of 0.1, and the momentum is set to 0.9. The learning rate is decayed by a factor of 0.99 at each round, and a weight decay of 1e-5 is applied. We adopt two different NIID partition strategies: (i) Sharding [37]: sort the data by label, divide it into same-sized shards, and control heterogeneity via s, the number of shards per user. (ii) Latent Dirichlet Allocation (LDA) [34, 46]: assign the partition of class c by sampling p_c ~ Dir(α)." (illustrative optimizer and partition sketches follow this table) |