DepthFL: Depthwise Federated Learning for Heterogeneous Clients

Authors: Minjae Kim, Sangyoon Yu, Suhyun Kim, Soo-Mook Moon

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that depth-scaled local models build a global model better than width-scaled ones, and that self-distillation is highly effective in training data-insufficient deep layers.
Researcher Affiliation | Academia | Minjae Kim, Seoul National University, mjkim@snu.ac.kr; Sangyoon Yu, Seoul National University, sangyoonyu@snu.ac.kr; Suhyun Kim, Korea Institute of Science and Technology, dr.suhyun.kim@gmail.com; Soo-Mook Moon, Seoul National University, smoon@snu.ac.kr
Pseudocode | Yes | Algorithm 1: DepthFL
Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository.
Open Datasets | Yes | We used MNIST, CIFAR-100, and Tiny ImageNet datasets for the image classification task, and the WikiText-2 dataset for the masked language modeling task.
Dataset Splits | No | The paper mentions the datasets used (MNIST, CIFAR-100, Tiny ImageNet, WikiText-2) but does not explicitly state the training, validation, and test splits or cite where these splits are defined for reproduction.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU models, or cloud instances) used to run the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions) required for replication.
Experiment Setup | Yes | Table 13 (Hyperparameters and model architecture used in experiments) provides specific values for Local Epoch E, Local Batch Size B, Optimizer (SGD), Momentum, Weight decay, Temperature, alpha (FedDyn), Consistency rampup, Communication rounds, Learning rate, Learning rate decay, Embedding Size, Number of heads, Dropout, and Sequence length.
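
The Research Type row quotes the paper's core finding: depth-scaled local models aggregate into a better global model, and self-distillation helps train the data-starved deep layers. As a reading aid only, the following is a minimal sketch of mutual self-distillation across depthwise exit classifiers, assuming a model that returns one logit tensor per exit; the function name, the pairwise KL formulation, and the alpha/temperature handling are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def self_distillation_loss(logits_list, targets, temperature=3.0, alpha=0.5):
    """Sketch of mutual self-distillation across depthwise exit classifiers.

    logits_list: list of [batch, num_classes] tensors, one per exit
    (shallow to deep). Each exit is trained on the labels and additionally
    distilled from the detached soft predictions of every other exit.
    All constants here are illustrative placeholders.
    """
    # Supervised cross-entropy at every exit.
    ce = sum(F.cross_entropy(logits, targets) for logits in logits_list)

    # Pairwise KL distillation between exits (teacher side detached).
    kd = 0.0
    for i, student in enumerate(logits_list):
        for j, teacher in enumerate(logits_list):
            if i == j:
                continue
            log_p_student = F.log_softmax(student / temperature, dim=1)
            p_teacher = F.softmax(teacher.detach() / temperature, dim=1)
            kd = kd + F.kl_div(log_p_student, p_teacher,
                               reduction="batchmean") * temperature ** 2

    return (1 - alpha) * ce + alpha * kd
```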
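
The Pseudocode row refers to Algorithm 1 (DepthFL), in which each client trains a prefix of the global network matched to its capacity and the server aggregates parameters layer by layer. The sketch below shows only that aggregation idea, assuming shallow clients simply omit the parameters of blocks they do not hold; the plain per-parameter averaging and the function name are illustrative, not a transcription of Algorithm 1.

```python
from collections import defaultdict
import numpy as np

def aggregate_depthwise(client_updates):
    """Average each parameter over the clients whose truncated model contains it.

    client_updates: list of dicts mapping parameter name -> np.ndarray.
    Shallow clients omit the parameters of deeper blocks, so every parameter
    is averaged only over the clients that actually trained it.
    """
    sums = defaultdict(lambda: None)
    counts = defaultdict(int)

    for update in client_updates:
        for name, value in update.items():
            sums[name] = value.copy() if sums[name] is None else sums[name] + value
            counts[name] += 1

    return {name: sums[name] / counts[name] for name in sums}
```

Under this assumption, a client that trains only the first k blocks uploads a parameter dict restricted to those blocks, so deeper layers are averaged over fewer contributors than shallow ones.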
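
The Experiment Setup row lists the hyperparameters reported in Table 13 without reproducing their values. Purely as a reproduction checklist, the skeleton below collects those fields in one place; every default is a placeholder to be overwritten with the numbers from Table 13 and is not taken from the paper (only the SGD optimizer is stated in the row itself).

```python
from dataclasses import dataclass

@dataclass
class DepthFLConfig:
    """Fields named after the hyperparameters listed in Table 13 of the paper.

    All defaults are placeholders; substitute the values reported in Table 13
    for the dataset and model being reproduced.
    """
    local_epochs: int = 1             # Local Epoch E (placeholder)
    local_batch_size: int = 32        # Local Batch Size B (placeholder)
    optimizer: str = "SGD"            # stated in the paper
    momentum: float = 0.9             # placeholder
    weight_decay: float = 1e-4        # placeholder
    temperature: float = 1.0          # distillation temperature (placeholder)
    feddyn_alpha: float = 0.1         # alpha (FedDyn) (placeholder)
    consistency_rampup: int = 0       # placeholder
    communication_rounds: int = 100   # placeholder
    learning_rate: float = 0.1        # placeholder
    lr_decay: float = 1.0             # learning rate decay (placeholder)
    # Fields for the WikiText-2 masked language modeling task:
    embedding_size: int = 128         # placeholder
    num_heads: int = 2                # placeholder
    dropout: float = 0.1              # placeholder
    sequence_length: int = 64         # placeholder
```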