Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Class-wise Balancing Data Replay for Federated Class-Incremental Learning
Authors: Zhuang Qi, Ying-Peng Tang, Lei Meng, Han Yu, Xiaoxiao Li, Xiangxu Meng
NeurIPS 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments were conducted on three datasets with different levels of heterogeneity, including performance comparisons, ablation studies, in-depth analysis, and case studies. The results demonstrate that FedCBDR effectively balances the number of replayed samples across classes and alleviates the long-tail problem. Compared to six state-of-the-art existing methods, FedCBDR achieves a 2%-15% Top-1 accuracy improvement. |
| Researcher Affiliation | Academia | ¹School of Software, Shandong University, China; ²College of Computing and Data Science, Nanyang Technological University, Singapore; ³Department of Electrical and Computer Engineering, University of British Columbia, Canada; ⁴Vector Institute, Canada |
| Pseudocode | Yes | Algorithm 1 FEDCBDR |
| Open Source Code | Yes | The code will be made available as supplementary material. |
| Open Datasets | Yes | Following existing studies [27, 30], we conducted all experiments on three commonly used datasets, including CIFAR10 [53, 54], CIFAR100 [53, 54] and Tiny ImageNet [55] to validate the effectiveness of the FedCBDR. |
| Dataset Splits | Yes | We simulate heterogeneous data distributions across clients using the Dirichlet distribution with parameters β = {0.1, 0.5, 1.0}, where smaller values of β correspond to higher level of data heterogeneity. The statistical details are presented in the Table 1. ... The number of stored samples per task varies by dataset and split setting: for CIFAR10, 450 samples are stored under 3-task splits and 300 under 5-task splits; for CIFAR100, 1,000 samples are used for 5-task splits and 500 for 10-task splits; for Tiny ImageNet, 2,000 samples are stored for 10-task splits and 1,000 for 20-task splits. |
| Hardware Specification | Yes | And training on each client is performed using an NVIDIA RTX 3090 GPU (24 GB). |
| Software Dependencies | No | The paper mentions 'ResNet-18 as the backbone' and 'SGD optimizer' but does not provide specific version numbers for any software libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | In the experiments, the number of clients is fixed at K = 5, with each client running local epochs E = 2 per round, using a batch size B = 128. For all datasets, we adopt ResNet-18 as the backbone, with the classifier's output dimension dynamically updated as tasks progress and conduct T = 100 communication rounds per task. The SGD optimizer is employed with a learning rate of 0.01 and a weight decay of 1 × 10⁻⁵. The number of stored samples per task varies by dataset and split setting: for CIFAR10, 450 samples are stored under 3-task splits and 300 under 5-task splits; for CIFAR100, 1,000 samples are used for 5-task splits and 500 for 10-task splits; for Tiny ImageNet, 2,000 samples are stored for 10-task splits and 1,000 for 20-task splits. For the temperature and weighted parameters, we select τ_old ∈ {0.8, 0.9} and w_old ∈ {1.1, 1.2, 1.3, 1.4} for previous tasks, while τ_new ∈ {1.1, 1.2} and w_new ∈ {0.7, 0.8, 0.9} are used for newly arrived tasks. |
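
The Dataset Splits row describes a Dirichlet-based partition across K = 5 clients with β ∈ {0.1, 0.5, 1.0}. The paper's own partitioning code is not shown here, so the following is only a minimal sketch of one common way such a split is simulated: for each class, client proportions are drawn from a symmetric Dirichlet(β) and that class's samples are divided accordingly. The function name `dirichlet_partition`, its parameters, and the seed handling are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients=5, beta=0.5, seed=0):
    """Assign sample indices to clients using per-class Dirichlet(beta) proportions.

    Illustrative sketch only; not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)

        # A symmetric Dirichlet draw is a normalized vector of Gamma(beta, 1) draws.
        weights = [rng.gammavariate(beta, 1.0) for _ in range(num_clients)]
        total = sum(weights) or 1.0  # guard against an all-zero draw

        # Turn cumulative proportions into cut points over this class's indices.
        cuts, running = [], 0.0
        for w in weights[:-1]:
            running += w / total
            cuts.append(min(len(indices), int(round(running * len(indices)))))

        start = 0
        for client, end in zip(clients, cuts + [len(indices)]):
            client.extend(indices[start:end])
            start = end
    return clients
```

Calling `dirichlet_partition(labels, num_clients=5, beta=0.1)` on a list of integer class labels returns five index lists that together cover every sample exactly once. Smaller β values tend to concentrate each class on fewer clients, which matches the row's statement that smaller β corresponds to higher heterogeneity.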