Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

Federated Continual Learning via Orchestrating Multi-Scale Expertise

Authors: Xiaoyang Yi, Yang Liu, Binhan Yang, Jian Jun Zhang

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results demonstrate that MultiFCL achieves state-of-the-art performance across multiple datasets and settings, showcasing its effectiveness in FCL scenarios. (Abstract) We compare MultiFCL with other baselines, the results are shown in Table 1 and Table 2. (Section 4.2) We perform the ablation study of MultiFCL on CIFAR100 with 10 Tasks, the results are shown in Table 3. (Section 4.3)
Researcher Affiliation	Academia	(1) College of Cryptology and Cyber Science, Nankai University, China; (2) College of Computer Science, Nankai University, China; (3) Tianjin Key Laboratory of Network and Data Security Technology, Tianjin, China; (4) Key Laboratory of Data and Intelligent System Security, Ministry of Education, Tianjin, China
Pseudocode	Yes	Algorithm 1 displays the detailed process in MultiFCL. It employs adapters to fine-tune the PTM and leverages the semantic features of old tasks to initialize new class prototypes. Then, it establishes multiple experts, employing feature learning loss and the multi-teacher dynamic self-distillation to transfer knowledge to the final expert. Algorithm 1: MultiFCL Framework
Open Source Code	Yes	The code is available at https://github.com/yang12318/MultiFCL.
Open Datasets	Yes	We use CIFAR100 (Krizhevsky & Hinton, 2009), Tiny ImageNet (mnmoustafa & Ali, 2017), ImageNet-R (Hendrycks et al., 2021), and CUB-200-2011 (Wah et al., 2011), which are not learned by CLIP during pre-training.
Dataset Splits	Yes	We divide all datasets into 5 and 10 tasks for incremental learning... (Section 4.1) Following the setting of existing FCL methods, we employ two different data partition strategies, including distribution-based partitioning and quality-based partitioning, where the former fixes the number of samples per class and specifies the number of classes owned by the client through α, while the latter uses β to specify the degree of data heterogeneity based on the Dirichlet distribution. (Section 4.1) Figure 8 shows the data distribution under different datasets and partitioning strategies in the last task, which the ID of clients and labels are only used to distinguish between different clients and labels. (Appendix B)
Hardware Specification	No	No specific hardware details (like GPU/CPU models, memory) are provided. The paper only mentions "We evaluate CIL and TIL experiments using CLIP..." and "Specifically, we set up 10 clients..." without specifying the underlying hardware for these operations.
Software Dependencies	No	The paper does not explicitly state specific software versions (e.g., Python, PyTorch, TensorFlow, CUDA versions) used for implementation.
Experiment Setup	Yes	Specifically, we set up 10 clients, with each client owning 4 experts and performing 5 epochs of local training. For non-PTM methods, we use a learning rate of 1e-2 and conduct 100 communication rounds per task. For PTM methods, we use a learning rate of 1e-5 and perform 5 communication rounds per task.
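
The quoted experiment setup can be written down as a small configuration sketch, which makes the difference in training budget between the two method families easy to see. This is only an illustration: the class name, field names, and helper method below are hypothetical and are not taken from the paper or its repository, and the sketch assumes that the 10 clients, 4 experts, and 5 local epochs apply to both the non-PTM and PTM settings as the quoted passage suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FCLSetup:
    """Hyperparameters quoted in the Experiment Setup row.

    All names here are hypothetical; only the values come from the quoted text.
    """
    num_clients: int
    experts_per_client: int
    local_epochs: int        # local training epochs per communication round
    learning_rate: float
    rounds_per_task: int     # communication rounds per incremental task

    def local_epochs_per_task(self) -> int:
        # Total local epochs one client runs for a single task.
        return self.local_epochs * self.rounds_per_task


# Non-PTM methods: learning rate 1e-2, 100 communication rounds per task.
NON_PTM = FCLSetup(num_clients=10, experts_per_client=4, local_epochs=5,
                   learning_rate=1e-2, rounds_per_task=100)

# PTM methods: learning rate 1e-5, 5 communication rounds per task.
PTM = FCLSetup(num_clients=10, experts_per_client=4, local_epochs=5,
               learning_rate=1e-5, rounds_per_task=5)

if __name__ == "__main__":
    for name, cfg in (("non-PTM", NON_PTM), ("PTM", PTM)):
        print(name, cfg.local_epochs_per_task(), "local epochs per client per task")
```

Under these values, a client in the non-PTM setting runs 5 × 100 = 500 local epochs per task, against 5 × 5 = 25 in the PTM setting.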