Preservation of the Global Knowledge by Not-True Distillation in Federated Learning
Authors: Gihun Lee, Minchan Jeong, Yongjin Shin, Sangmin Bae, Se-Young Yun
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiments, FedNTD shows state-of-the-art performance on various setups without compromising data privacy or incurring additional communication costs. We test our algorithm on MNIST [11], CIFAR-10 [25], CIFAR-100 [25], and CINIC-10 [10]. We compare our FedNTD with various existing works, with results shown in Table 1. |
| Researcher Affiliation | Academia | Gihun Lee*, Minchan Jeong*, Yongjin Shin, Sangmin Bae, Se-Young Yun KAIST {opcrisis, mcjeong, yj.shin, bsmn0223, yunseyoung}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1 Federated Not-True Distillation (FedNTD) |
| Open Source Code | Yes | https://github.com/Lee-Gihun/FedNTD |
| Open Datasets | Yes | We test our algorithm on MNIST [11], CIFAR-10 [25], CIFAR-100 [25], and CINIC-10 [10]. |
| Dataset Splits | No | The paper does not explicitly describe a separate validation dataset split with specific percentages or counts. It primarily refers to training and testing. |
| Hardware Specification | No | The provided text does not specify the hardware used for experiments, such as specific GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions 'Pytorch' in a reference but does not specify any software dependencies with version numbers (e.g., Python version, PyTorch version, specific libraries with versions) used for their experiments. |
| Experiment Setup | Yes | We use a momentum SGD with an initial learning rate of 0.1, and the momentum is set as 0.9. The learning rate is decayed with a factor of 0.99 at each round, and a weight decay of 1e-5 is applied. We adopt two different NIID partition strategies: (i) Sharding [37]: sort the data by label and divide the data into same-sized shards, and control the heterogeneity by s, the number of shards per user. (ii) Latent Dirichlet Allocation (LDA) [34, 46]: assigns partition of class c by sampling p_c ~ Dir(α). |
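
The "Pseudocode" row above refers to Algorithm 1 (FedNTD) in the paper. For orientation, the sketch below shows one way a not-true distillation objective can be written in PyTorch: the ground-truth class logit is masked out of both the local and global outputs, and the local model is distilled toward the global model's distribution over the remaining ("not-true") classes. The function name `ntd_loss` and the hyperparameters `tau` and `beta` are illustrative placeholders, not the authors' reference implementation (which is available at the GitHub repository listed above).

```python
import torch
import torch.nn.functional as F

def ntd_loss(local_logits, global_logits, targets, tau=1.0, beta=1.0):
    """Hedged sketch of a not-true distillation objective (placeholder names)."""
    num_classes = local_logits.size(1)

    # Boolean mask that drops the ground-truth class from each row.
    not_true_mask = torch.ones_like(local_logits, dtype=torch.bool)
    not_true_mask.scatter_(1, targets.unsqueeze(1), False)

    local_nt = local_logits[not_true_mask].view(-1, num_classes - 1)
    # The global model acts as a fixed teacher, so no gradient flows through it.
    global_nt = global_logits[not_true_mask].view(-1, num_classes - 1).detach()

    # Temperature-scaled KL distillation restricted to the not-true classes.
    kd = F.kl_div(
        F.log_softmax(local_nt / tau, dim=1),
        F.softmax(global_nt / tau, dim=1),
        reduction="batchmean",
    ) * (tau ** 2)

    # Standard cross-entropy on the full logits plus the weighted distillation term.
    ce = F.cross_entropy(local_logits, targets)
    return ce + beta * kd
```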
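The "Experiment Setup" row reports momentum SGD with an initial learning rate of 0.1, momentum 0.9, weight decay 1e-5, and a per-round learning-rate decay factor of 0.99. A minimal PyTorch sketch of those settings follows; `model` and `num_rounds` are placeholders, and in an actual federated loop the decayed rate would typically be applied when each selected client's local optimizer is constructed for the round.

```python
import torch

model = torch.nn.Linear(10, 10)  # placeholder model
num_rounds = 100                 # placeholder number of communication rounds

# Momentum SGD with the reported hyperparameters.
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-5
)
# Multiply the learning rate by 0.99 after every communication round.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)

for round_idx in range(num_rounds):
    # ... local training for the selected clients would happen here ...
    scheduler.step()
```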
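The same row describes the two NIID partition strategies. Below is a minimal NumPy sketch of the LDA (Dirichlet) partition, assuming integer class labels and a per-class proportion vector p_c ~ Dir(α) over clients; smaller α yields more heterogeneous client data. The function name `lda_partition` is a placeholder, not the authors' implementation, and the sharding scheme (sort by label, assign s same-sized shards per user) is not reproduced here.

```python
import numpy as np

def lda_partition(labels, num_clients, alpha, seed=0):
    """Sketch of LDA (Dirichlet) partitioning: split each class across clients
    according to a sampled proportion vector p_c ~ Dir(alpha)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]

    for c in np.unique(labels):
        # Shuffle the sample indices belonging to class c.
        idx_c = rng.permutation(np.where(labels == c)[0])
        # p_c ~ Dir(alpha): how class c is spread across clients.
        p_c = rng.dirichlet(alpha * np.ones(num_clients))
        # Convert proportions to split points within the class indices.
        cuts = (np.cumsum(p_c)[:-1] * len(idx_c)).astype(int)
        for client_id, part in enumerate(np.split(idx_c, cuts)):
            client_indices[client_id].extend(part.tolist())

    return client_indices

# Example usage with toy labels: 5 clients, alpha = 0.5 (more heterogeneous).
toy_labels = np.repeat(np.arange(10), 100)
parts = lda_partition(toy_labels, num_clients=5, alpha=0.5)
```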