Federated Submodel Optimization for Hot and Cold Data Features

Authors: Yucheng Ding, Chaoyue Niu, Fan Wu, Shaojie Tang, Chengfei Lyu, Yanghe Feng, Guihai Chen

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We finally evaluate FedSubAvg over several public and industrial datasets. The evaluation results demonstrate that FedSubAvg significantly outperforms FedAvg and its variants."
Researcher Affiliation | Collaboration | Yucheng Ding (Shanghai Jiao Tong University), Chaoyue Niu (Shanghai Jiao Tong University), Fan Wu (Shanghai Jiao Tong University), Shaojie Tang (University of Texas at Dallas), Chengfei Lv (Alibaba Group), Yanghe Feng (National University of Defense Technology), Guihai Chen (Shanghai Jiao Tong University)
Pseudocode | Yes | "Algorithm 1 Federated Submodel Averaging (FedSubAvg)" (a sketch of the element-wise aggregation idea appears after this table)
Open Source Code | Yes | "The code is available on https://github.com/sjtu-yc/federated-submodel-averaging."
Open Datasets | Yes | "Using the public MovieLens, Sentiment140, and Amazon datasets, as well as an industrial dataset from Alibaba, we extensively evaluate FedSubAvg and compare it with FedAvg, FedProx, SCAFFOLD, and FedAdam."
Dataset Splits | No | "We randomly select 20% of the samples as the test dataset and leave the remaining 80% as the training dataset for FL." The paper does not explicitly describe a validation split.
Hardware Specification | Yes | "All experiments were run on a server with 8 NVIDIA 2080Ti GPUs."
Software Dependencies | No | The paper mentions mini-batch SGD and the Adam optimizer but does not specify software dependencies such as programming languages, libraries, or frameworks with version numbers.
Experiment Setup | Yes | "For the tasks of rating classification and sentiment analysis, K = 50 clients are randomly chosen per round as default; and for the CTR prediction tasks, K is set to 100 as default. ... For all the datasets, the batch size for each client is set to 16. The number of local epochs is set to 1 for all the algorithms. We use the Adam optimizer for all the local training process with β1 = 0.9 and β2 = 0.999. The learning rate for each task is tuned using grid search over {0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1}." (The reported split and hyperparameters are mirrored in the setup sketch after this table.)
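To give a rough feel for Algorithm 1, the following is a minimal sketch of the element-wise averaging idea behind FedSubAvg: each coordinate is averaged only over the clients whose submodels involve it, rather than over all participants as in FedAvg. The function and argument names are illustrative, the weighting by local dataset size is an assumption, and Algorithm 1 in the paper remains the authoritative description.

```python
import numpy as np

def fedsubavg_aggregate(global_param, client_updates, client_masks, client_sizes):
    """Illustrative element-wise aggregation in the spirit of FedSubAvg (not the paper's exact rule).

    global_param:   1-D array holding the full global model parameters
    client_updates: per-client update vectors, zero outside each client's submodel
    client_masks:   per-client boolean vectors marking the submodel coordinates
    client_sizes:   per-client local dataset sizes, used as weights (assumption)
    """
    weighted_sum = np.zeros_like(global_param)
    coverage = np.zeros_like(global_param)
    for update, mask, size in zip(client_updates, client_masks, client_sizes):
        weighted_sum += size * update * mask
        coverage += size * mask
    # Average each coordinate only over the clients whose submodels cover it,
    # instead of over all selected clients as plain FedAvg would.
    delta = np.divide(weighted_sum, coverage,
                      out=np.zeros_like(weighted_sum), where=coverage > 0)
    return global_param + delta
```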
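The reported dataset split and training hyperparameters can also be mirrored in a short setup sketch. Only the quoted numbers (80/20 split, batch size 16, 1 local epoch, K clients per round, Adam with β1 = 0.9 and β2 = 0.999, and the learning-rate grid) come from the paper; the toy data and model are placeholders, and the real pipeline lives in the authors' repository.

```python
import torch

# Toy stand-ins so the sketch runs end to end; the actual datasets and models are in the repo.
samples = list(range(1000))                                  # placeholder samples
perm = torch.randperm(len(samples)).tolist()
split = int(0.8 * len(samples))
train_idx, test_idx = perm[:split], perm[split:]             # 80% train / 20% test

BATCH_SIZE = 16            # batch size for each client
LOCAL_EPOCHS = 1           # local epochs per round
CLIENTS_PER_ROUND = 50     # K = 50 by default; K = 100 for the CTR prediction tasks
LR_GRID = [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1]

model = torch.nn.Linear(10, 2)                               # placeholder model
for lr in LR_GRID:
    # Adam with the reported betas; the best learning rate is kept after grid search.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, betas=(0.9, 0.999))
    # ... run federated rounds with CLIENTS_PER_ROUND clients, BATCH_SIZE, LOCAL_EPOCHS ...
```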