Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
CLIP-Guided Federated Learning on Heterogeneity and Long-Tailed Data
Authors: Jiangming Shi, Shanshan Zheng, Xiangbo Yin, Yang Lu, Yuan Xie, Yanyun Qu
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on several benchmarks demonstrate that CLIP2FL achieves impressive performance and effectively deals with data heterogeneity and long-tail distribution. The code is available at https://github.com/shijiangming1/CLIP2FL. |
| Researcher Affiliation | Academia | Jiangming Shi1, Shanshan Zheng2, Xiangbo Yin2, Yang Lu2, Yuan Xie3,4, Yanyun Qu1,2*; 1 Institute of Artificial Intelligence, Xiamen University; 2 School of Informatics, Xiamen University; 3 East China Normal University; 4 Chongqing Institute of East China Normal University |
| Pseudocode | Yes | Algorithm 1: Training Process for Round t |
| Open Source Code | Yes | The code is available at https://github.com/shijiangming1/CLIP2FL. |
| Open Datasets | Yes | Datasets. We implement CLIP2FL on three frequently used datasets with the long-tailed data: CIFAR-10/100-LT (Krizhevsky, Hinton et al. 2009) and ImageNet-LT (Russakovsky et al. 2015). |
| Dataset Splits | No | No explicit mention of specific train/validation/test dataset splits (e.g., percentages, sample counts) for the overall datasets was found. The paper describes data partitioning among clients and long-tailed distribution generation. |
| Hardware Specification | Yes | Experiments were conducted using PyTorch on four NVIDIA GeForce RTX 3090 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | The number of clients is set to 20, and 40% of them are randomly selected as online clients to participate in training. The batch size of client-side training is set to 32 for all datasets and we set the number of federated features to 100 for each class. ... We employed the standard cross-entropy loss by default and executed 200 communication rounds. ... Three important hyperparameters in our CLIP2FL are β, η and m. We found that CLIP2FL achieved the best performance when β = 3.0, η ∈ {0.001, 0.0001, 1e-5} and m = 100. |
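
The experiment setup reported in the table can be collected into a small configuration object. The following is a minimal sketch, not the authors' code: the class `FLConfig`, its field names, the helper `sample_online_clients`, and the random seed are all hypothetical and chosen for illustration. Only the numeric values come from the table. Whether `m` and the number of federated features per class denote the same quantity is not stated here, so they are kept as separate fields.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FLConfig:
    """Hypothetical container for the values quoted in the table."""
    num_clients: int = 20                      # total number of clients
    online_fraction: float = 0.4               # 40% selected per round
    batch_size: int = 32                       # client-side batch size
    federated_features_per_class: int = 100    # as quoted in the table
    communication_rounds: int = 200
    beta: float = 3.0                          # reported best value of β
    eta_candidates: tuple = (1e-3, 1e-4, 1e-5) # reported values of η
    m: int = 100                               # reported best value of m


def sample_online_clients(cfg: FLConfig, rng: random.Random) -> list:
    """Uniformly pick the online clients for one round (assumed scheme)."""
    k = round(cfg.num_clients * cfg.online_fraction)
    return sorted(rng.sample(range(cfg.num_clients), k))


cfg = FLConfig()
rng = random.Random(0)  # seed is an arbitrary choice for repeatability
online = sample_online_clients(cfg, rng)
print(f"{len(online)} of {cfg.num_clients} clients online: {online}")
```

With 20 clients and a 40% participation rate, each round selects 8 clients. Uniform sampling without replacement is an assumption here; the table says only that online clients are "randomly selected".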