Federated Continual Learning with Weighted Inter-client Transfer

Authors: Jaehong Yoon, Wonyong Jeong, Giwoong Lee, Eunho Yang, Sung Ju Hwang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our FedWeIT against existing federated learning and continual learning methods under varying degrees of task similarity across clients, and our model significantly outperforms them with a large reduction in the communication cost.
Researcher Affiliation | Collaboration | (1) Korea Advanced Institute of Science and Technology (KAIST), South Korea; (2) AITRICS, South Korea.
Pseudocode | Yes | Algorithm 1: Federated Weighted Inter-client Transfer (an illustrative sketch of the parameter decomposition it builds on appears below the table).
Open Source Code | Yes | Code is available at https://github.com/wyjeong/FedWeIT.
Open Datasets | Yes | We validate our FedWeIT under different configurations of task sequences against baselines, namely Overlapped-CIFAR-100 and NonIID-50. ... MNIST (LeCun et al., 1998), CIFAR-10/-100 (Krizhevsky & Hinton, 2009), SVHN (Netzer et al., 2011), Fashion-MNIST (Xiao et al., 2017), Not-MNIST (Bulatov, 2011), FaceScrub (Ng & Winkler, 2014), and TrafficSigns (Stallkamp et al., 2011). (A toy task-split sketch appears below the table.)
Dataset Splits | Yes | Table A.5: Dataset Details of the NonIID-50 Task. We provide dataset details of the NonIID-50 dataset, including the 8 heterogeneous datasets, the number of sub-tasks, the classes per sub-task, and the instances of the train, valid, and test sets.
Hardware Specification | No | The paper describes the network architectures used (LeNet, ResNet-18) but does not provide specific hardware details such as GPU/CPU models, processors, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using an Adam optimizer but does not specify software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow versions, CUDA).
Experiment Setup | Yes | We use an Adam optimizer with adaptive learning rate decay, which decays the learning rate by a factor of 3 for every 5 epochs with no consecutive decrease in the validation loss. We stop training in advance and start learning the next task (if available) when the learning rate reaches ρ. For the LeNet experiment with 5 clients, we initialize the learning rate to 1e-3 at the beginning of each new task and set ρ = 1e-7. The mini-batch size is 100, the number of rounds per task is 20, and the number of epochs per round is 1. The setting for ResNet-18 is identical except for the initial learning rate, 1e-4. For the experiments with 20 and 100 clients, we use the same settings except that the mini-batch size is reduced from 100 to 10 and the initial learning rate is 1e-4. We use client fractions of 0.25 and 0.05, respectively, at each communication round.
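
The learning-rate schedule quoted in the Experiment Setup row is compact enough to restate in code. Below is a minimal Python sketch of one plausible reading of it: cut the learning rate by a factor of 3 whenever the validation loss has not improved for 5 consecutive epochs, and stop the current task early once the rate falls below ρ. The class name, the patience bookkeeping, and the toy loss values are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the quoted learning-rate schedule (one plausible reading):
# divide the learning rate by 3 after 5 consecutive epochs without a drop in
# validation loss, and stop the current task once the rate falls below rho.
class AdaptiveLRDecay:
    def __init__(self, lr_init=1e-3, factor=3.0, patience=5, rho=1e-7):
        self.lr = lr_init
        self.factor = factor
        self.patience = patience
        self.rho = rho
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Update after one epoch; return False once training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.lr /= self.factor
                self.bad_epochs = 0
        return self.lr >= self.rho

# Usage: feed the validation loss after every epoch.
sched = AdaptiveLRDecay()
for val_loss in [0.90, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85]:
    if not sched.step(val_loss):
        break
```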
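
Algorithm 1 (Federated Weighted Inter-client Transfer), referenced in the Pseudocode row, rests on FedWeIT's parameter decomposition: each client composes its task-specific parameters from a shared base parameter modulated by a sparse mask, its own task-adaptive parameter, and an attention-weighted sum of other clients' task-adaptive parameters, while the server aggregates the base parameters. The NumPy sketch below illustrates only that composition; the binary mask, the plain averaging at the server, and all names and shapes are simplifying assumptions rather than the authors' code (see the linked repository for the actual implementation).

```python
# Minimal NumPy sketch of the parameter decomposition behind Algorithm 1:
# theta_c^(t) = B * m_c^(t) + A_c^(t) + sum_i alpha_i * A_i^(j)
# Binary mask, plain averaging, names, and shapes are illustrative assumptions.
import numpy as np

def compose_client_params(B, mask, A_local, foreign_adaptives, alpha):
    """Compose a client's task parameters from the shared base, its sparse mask,
    its own task-adaptive parameter, and weighted foreign task-adaptive params."""
    theta = B * mask + A_local
    for a_i, A_i in zip(alpha, foreign_adaptives):
        theta = theta + a_i * A_i  # weighted inter-client transfer
    return theta

def server_aggregate(base_updates):
    """Server-side step: average the clients' (sparsified) base parameters."""
    return np.mean(np.stack(base_updates), axis=0)

# Toy example: one 3x3 weight matrix, two foreign task-adaptive parameters.
rng = np.random.default_rng(0)
B = rng.normal(size=(3, 3))                      # shared base parameter
mask = (rng.random((3, 3)) > 0.5).astype(float)  # sparse mask (binary here)
A_local = 0.1 * rng.normal(size=(3, 3))          # this client's task-adaptive param
foreign = [0.1 * rng.normal(size=(3, 3)) for _ in range(2)]
alpha = np.array([0.7, 0.3])                     # attention over foreign adaptives

theta = compose_client_params(B, mask, A_local, foreign, alpha)
new_base = server_aggregate([B + 0.01 * rng.normal(size=(3, 3)) for _ in range(5)])
```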
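
The Overlapped-CIFAR-100 and NonIID-50 benchmarks in the Open Datasets row are built by partitioning class labels into sub-tasks and distributing them across clients. The snippet below sketches only the generic class-partitioning step; the function name, the seed, and the example numbers (100 classes, 5 classes per sub-task) are placeholders and do not reproduce the paper's exact task construction or client assignment.

```python
# Illustrative sketch (not the authors' data pipeline): partition a label set
# into disjoint sub-tasks with a fixed number of classes each. The numbers
# below are placeholders, not the paper's exact benchmark construction.
import random

def make_subtasks(num_classes, classes_per_task, seed=0):
    """Shuffle class ids and cut them into disjoint sub-tasks."""
    labels = list(range(num_classes))
    random.Random(seed).shuffle(labels)
    return [labels[i:i + classes_per_task]
            for i in range(0, num_classes, classes_per_task)]

# e.g. 100 classes cut into 20 sub-tasks of 5 classes each
tasks = make_subtasks(num_classes=100, classes_per_task=5)
```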