Practical One-Shot Federated Learning for Cross-Silo Setting

Authors: Qinbin Li, Bingsheng He, Dawn Song

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on various tasks show that FedKT can significantly outperform the other state-of-the-art federated learning algorithms with a single communication round.
Researcher Affiliation | Academia | Qinbin Li (National University of Singapore), Bingsheng He (National University of Singapore), Dawn Song (University of California, Berkeley); {qinbin, hebs}@comp.nus.edu.sg, dawnsong@cs.berkeley.edu
Pseudocode | Yes | Algorithm 1: The FedKT algorithm. (A minimal sketch of the knowledge-transfer pipeline it describes appears after the table.)
Open Source Code | Yes | The code is publicly available at https://github.com/QinbinLi/FedKT.
Open Datasets | Yes | To evaluate FedKT, we conduct experiments on four public datasets: (1) a random forest on the Adult dataset; (2) a gradient boosting decision tree (GBDT) model on the cod-rna dataset; (3) a multilayer perceptron (MLP) with two hidden layers on the MNIST dataset; (4) a CNN on the extended SVHN dataset.
Dataset Splits | No | For the first two datasets, we split the original dataset at random into train/test/public sets with a 75%/12.5%/12.5% proportion. For MNIST and SVHN, we use one half of the original test dataset as the public dataset and the remaining half as the final test dataset. The paper describes a 'public' dataset used for knowledge transfer when training the student and final models, but no distinct validation set for hyperparameter tuning in the traditional sense. (An illustrative split appears after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not explicitly state software dependencies or their version numbers.
Experiment Setup | Yes | (1) Random forest on Adult: the number of trees is set to 100 and the maximum tree depth to 6. (2) GBDT on cod-rna: the maximum tree depth is set to 6. (3) MLP on MNIST: two hidden layers with 100 units each and ReLU activations. (4) CNN on extended SVHN: two 5x5 convolution layers, each followed by 2x2 max pooling (the first with 6 channels, the second with 16), two fully connected layers with ReLU activations (120 and 84 units), and a final softmax output layer. By default, the number of parties is 50 for Adult and cod-rna and 10 for MNIST and SVHN, and s is set to 2 and t to 5 for all datasets. (PyTorch sketches of the MLP and CNN appear after the table.)
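
The Pseudocode row only names Algorithm 1. As a rough illustration of the one-shot knowledge-transfer idea the paper builds on (locally trained teacher models label a shared public dataset by voting, and a server-side model is then trained on those pseudo-labels), here is a minimal Python sketch. The function names, the scikit-learn models, and the plain majority vote are illustrative assumptions; the paper's actual two-level structure (s student partitions, t local models per party) and its voting details are not reproduced here.

```python
# Hypothetical sketch of a one-shot, knowledge-transfer style pipeline in the
# spirit of FedKT's Algorithm 1. Simplified: a single teacher per party and a
# plain majority vote; the paper's partitioning and voting details are omitted.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def majority_vote(predictions):
    """Column-wise majority vote over a (num_models, num_samples) array
    of integer-encoded class labels."""
    predictions = np.asarray(predictions)
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, predictions)

def one_shot_knowledge_transfer(party_datasets, X_public, X_test):
    # 1) Each party trains a local teacher model on its private data
    #    (only one communication round is needed afterwards).
    teachers = []
    for X_local, y_local in party_datasets:
        model = RandomForestClassifier(n_estimators=100, max_depth=6)
        teachers.append(model.fit(X_local, y_local))

    # 2) Teachers vote on the unlabeled public dataset to produce pseudo-labels.
    pseudo_labels = majority_vote([t.predict(X_public) for t in teachers])

    # 3) The server trains a final model on the pseudo-labeled public data.
    final_model = RandomForestClassifier(n_estimators=100, max_depth=6)
    final_model.fit(X_public, pseudo_labels)
    return final_model.predict(X_test)
```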
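
The splits quoted in the Dataset Splits row can be mimicked along the following lines; the helper names and the use of scikit-learn's train_test_split are assumptions rather than the authors' released code.

```python
# Illustrative reconstruction of the splits described above (not the authors' code).
import numpy as np
from sklearn.model_selection import train_test_split

# Adult / cod-rna: random 75% / 12.5% / 12.5% train / test / public split.
def split_tabular(X, y, seed=0):
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, train_size=0.75, random_state=seed)
    X_test, X_public, y_test, y_public = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=seed)  # 12.5% each
    return (X_train, y_train), (X_test, y_test), (X_public, y_public)

# MNIST / SVHN: one half of the official test set becomes the public set,
# the other half the final test set.
def split_image_test_set(X_test_full, y_test_full, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X_test_full))
    half = len(idx) // 2
    public_idx, test_idx = idx[:half], idx[half:]
    return ((X_test_full[public_idx], y_test_full[public_idx]),
            (X_test_full[test_idx], y_test_full[test_idx]))
```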
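
The MLP and CNN described in the Experiment Setup row follow standard architectures. A PyTorch rendering of that description, assuming 28x28 grayscale MNIST inputs, 32x32 RGB SVHN inputs, and ReLU activations after the convolution layers (details the quoted text does not spell out), might look like this:

```python
# Minimal PyTorch sketch of the MLP and CNN described above; input sizes and
# the conv-layer activations are standard assumptions, layer sizes follow the
# quoted setup.
import torch.nn as nn

# MLP for MNIST: two hidden layers of 100 units with ReLU activations.
mlp = nn.Sequential(
    nn.Flatten(),                 # 28x28 grayscale image -> 784 features
    nn.Linear(28 * 28, 100), nn.ReLU(),
    nn.Linear(100, 100), nn.ReLU(),
    nn.Linear(100, 10),           # 10 digit classes
)

# CNN for extended SVHN: two 5x5 conv layers (6 then 16 channels), each with
# 2x2 max pooling, then fully connected ReLU layers of 120 and 84 units and a
# softmax output (here folded into the loss, e.g. nn.CrossEntropyLoss).
cnn = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),   # 32x32 -> 14x14
    nn.Conv2d(6, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),  # 14x14 -> 5x5
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
    nn.Linear(120, 84), nn.ReLU(),
    nn.Linear(84, 10),
)
```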