Accurate Forgetting for Heterogeneous Federated Continual Learning
Authors: Abudukelimu Wuerkaixi, Sen Cui, Jingfeng Zhang, Kunda Yan, Bo Han, Gang Niu, Lei Fang, Changshui Zhang, Masashi Sugiyama
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments affirm the superiority of our method over baselines. |
| Researcher Affiliation | Collaboration | 1. Institute for Artificial Intelligence, Tsinghua University (THUAI), Beijing National Research Center for Information Science and Technology (BNRist), Department of Automation, Tsinghua University, Beijing, P.R. China; 2. The University of Auckland; 3. RIKEN; 4. Hong Kong Baptist University; 5. DataCanvas Technology Co., Ltd.; 6. The University of Tokyo |
| Pseudocode | Yes | The algorithm of our method is detailed in Algorithm 1. |
| Open Source Code | Yes | Code is at: https://github.com/zaocan666/AF-FCL. |
| Open Datasets | Yes | For the EMNIST-based dataset containing 26 classes of handwritten letter images (Cohen et al., 2017), we use two settings with N=8, T=6, C=2: 1) EMNIST-LTP: in the LTP setting, we randomly sample classes from the entire dataset for each client; 2) EMNIST-shuffle: in the conventional shuffle setting, the task sets are consistent across all clients but arranged in different orders. 3) CIFAR100: we randomly sample 20 classes among the 100 classes of CIFAR100 (Krizhevsky et al., 2009) as a task for each of the 10 clients, with 4 tasks per client (N=10, T=4, C=20). 4) MNIST-SVHN-F: we set 10 clients with this mixed dataset; each client contains 6 tasks, and each task has 3 classes. (A sketch of this per-client task construction appears after the table.) |
| Dataset Splits | No | The paper states that for EMNIST-noisy, 'After learning sequentially on all tasks, we evaluate the final three tasks, which do not contain any noisy labels,' but it does not provide general or explicit train/validation/test dataset splits (e.g., percentages, sample counts, or a description of a distinct validation set) for all experiments or for model tuning purposes. |
| Hardware Specification | Yes | In the experiments, we conduct all methods on a local Linux server that has two physical CPU chips (Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz) and 32 logical cores. All methods are implemented using the PyTorch framework, and all models are trained on GeForce RTX 2080 Ti GPUs. |
| Software Dependencies | No | The paper states 'All methods are implemented using Pytorch framework' but does not specify the version number of PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For all experiments except CIFAR100, a learning rate of 1e-4 is used, with 60 global communication rounds and 100 local iterations. For CIFAR100, we set the learning rate to 1e-3, with 40 global communication rounds and 400 local iterations. Consistent with prior research (Yoon et al., 2021a; Qi et al., 2023), all clients participate in each communication round. A mini-batch size of 64 is adopted for training, and the Adam optimizer is employed for all models. (A sketch of this training schedule follows the table.) |
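
To make the dataset settings in the "Open Datasets" row concrete, below is a minimal sketch of how the per-client task sequences could be constructed. It is not the authors' released code: the helper `build_client_tasks` and the choice to sample classes without replacement within each client are illustrative assumptions based only on the N/T/C values quoted above.

```python
# Hypothetical sketch of per-client task construction for federated continual
# learning, following the N (clients), T (tasks per client), C (classes per
# task) values quoted above. Not the authors' AF-FCL data pipeline.
import random
from typing import List


def build_client_tasks(all_classes: List[int], num_clients: int,
                       tasks_per_client: int, classes_per_task: int,
                       shuffle_setting: bool, seed: int = 0) -> List[List[List[int]]]:
    """Return, for each client, a list of tasks; each task is a list of class ids.

    shuffle_setting=False mimics the LTP-style setting (classes sampled
    independently per client); shuffle_setting=True mimics the shuffle setting
    (the same task set for every client, presented in a different order).
    """
    rng = random.Random(seed)
    if shuffle_setting:
        # One shared pool of tasks, reordered per client.
        pool = rng.sample(all_classes, tasks_per_client * classes_per_task)
        shared_tasks = [pool[i * classes_per_task:(i + 1) * classes_per_task]
                        for i in range(tasks_per_client)]
        return [rng.sample(shared_tasks, tasks_per_client) for _ in range(num_clients)]
    # LTP-style: each client samples its own classes from the full label space
    # (assumed without replacement within a client).
    clients = []
    for _ in range(num_clients):
        sampled = rng.sample(all_classes, tasks_per_client * classes_per_task)
        clients.append([sampled[i * classes_per_task:(i + 1) * classes_per_task]
                        for i in range(tasks_per_client)])
    return clients


# EMNIST letters: 26 classes, N=8 clients, T=6 tasks, C=2 classes per task.
emnist_ltp = build_client_tasks(list(range(26)), num_clients=8,
                                tasks_per_client=6, classes_per_task=2,
                                shuffle_setting=False)
# CIFAR100: N=10 clients, T=4 tasks, C=20 classes per task.
cifar100_split = build_client_tasks(list(range(100)), num_clients=10,
                                    tasks_per_client=4, classes_per_task=20,
                                    shuffle_setting=False)
```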
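
The "Experiment Setup" row implies a straightforward federated schedule; the sketch below spells it out under stated assumptions. It uses plain FedAvg-style uniform averaging with Adam, full client participation, and the reported rounds/iterations; it is not the AF-FCL algorithm itself. `local_update`, `average_weights`, and `run_federated_training` are hypothetical helpers, and the client DataLoaders are assumed to be built with the stated mini-batch size of 64.

```python
# Hedged sketch of the reported federated training schedule. The model, data
# loaders, and aggregation rule are placeholders (uniform FedAvg averaging),
# not the authors' AF-FCL method.
import copy
import torch
from torch import nn, optim


def local_update(model: nn.Module, loader, lr: float, local_iters: int) -> dict:
    """Run `local_iters` mini-batch Adam steps on a copy of the global model."""
    model = copy.deepcopy(model)
    opt = optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    it = iter(loader)
    for _ in range(local_iters):
        try:
            x, y = next(it)
        except StopIteration:  # restart the loader when the local data is exhausted
            it = iter(loader)
            x, y = next(it)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model.state_dict()


def average_weights(states: list) -> dict:
    """Uniform FedAvg aggregation over client state dicts."""
    avg = copy.deepcopy(states[0])
    for k in avg:
        avg[k] = torch.stack([s[k].float() for s in states]).mean(dim=0)
    return avg


def run_federated_training(global_model, client_loaders,
                           rounds=60, local_iters=100, lr=1e-4):
    # CIFAR100 setting in the paper: rounds=40, local_iters=400, lr=1e-3.
    for _ in range(rounds):
        # All clients participate in every communication round.
        states = [local_update(global_model, loader, lr, local_iters)
                  for loader in client_loaders]
        global_model.load_state_dict(average_weights(states))
    return global_model
```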