Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Covariances for Free: Exploiting Mean Distributions for Training-free Federated Learning

Authors: Dipam Goswami, Simone Magistri, Kai Wang, Bartłomiej Twardowski, Andrew D. Bagdanov, Joost van de Weijer

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate FedCOF across multiple FL benchmarks, including the non-iid iNaturalist Users-120K dataset, achieving state-of-the-art results with lower communication cost than methods using second-order statistics, and showing superior performance to recent federated prompt-tuning approaches while also serving as an effective initialization for subsequent federated optimization methods such as fine-tuning and linear probing. 5 Experiments Datasets. We evaluate FedCOF on multiple datasets namely CIFAR-100 [28], ImageNet-R [16] (IN-R), CUB200 [50], Stanford Cars [27] and iNaturalist [48].
Researcher Affiliation	Academia	1 Department of Computer Science, Universitat Autònoma de Barcelona, Spain; 2 Computer Vision Center, Barcelona, Spain; 3 Media Integration and Communication Center (MICC), University of Florence, Italy; 4 IDEAS Research Institute, Warsaw, Poland; 5 City University of Hong Kong; 6 Program of Computer Science, City University of Hong Kong (Dongguan)
Pseudocode	Yes	Algorithm 1 FedCOF: Federated Learning with COvariances for Free
Open Source Code	Yes	Code is available at https://github.com/dipamgoswami/FedCOF.
Open Datasets	Yes	Datasets. We evaluate FedCOF on multiple datasets namely CIFAR-100 [28], ImageNet-R [16] (IN-R), CUB200 [50], Stanford Cars [27] and iNaturalist [48].
Dataset Splits	Yes	We distribute the first 4 datasets to 100 clients using a highly heterogeneous Dirichlet distribution (α = 0.1) following standard practice [17, 29]. We also use real-world non-iid FL benchmark of iNaturalist-Users-120K [18] (iNat-120K) having 1203 classes across 9275 clients. We discuss the dataset details in Appendix J. ... CIFAR-100 has 100 classes provided in 50k training and 10k testing images. ... CUB200 is a fine-grained dataset and has 200 classes of different bird species provided in 5994 training and 5794 testing images. ... Stanford Cars has 196 classes of cars with 8144 training images and 8041 test images.
Hardware Specification	Yes	We use one Nvidia RTX 6000 GPU for all our experiments.
Software Dependencies	No	We use the FLSim library. We use γ = 1 for all experiments with SqueezeNet and ViT-B/16, and γ = 0.1 for all experiments with MobileNetV2 due to very high dimensionality d of the feature space.
Experiment Setup	Yes	We use γ = 1 for all experiments with SqueezeNet and ViT-B/16, and γ = 0.1 for all experiments with MobileNetV2 due to very high dimensionality d of the feature space. We compare to FedCOF Oracle in which real class covariances are shared from clients and aggregated in server instead of using our estimated covariances (see Appendix J). For all experiments, we set the client participation in each round to 30%, and we show the training-free methods in multiple rounds in Figures 3 and 4. We provide more implementation details in Appendix J. We discuss computation of communication costs for all methods in Appendix H.
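
The Dataset Splits and Experiment Setup rows mention a Dirichlet partition (α = 0.1) over 100 clients and 30% client participation per round. The following is a minimal sketch of what such a setup commonly looks like, not the authors' code and not the FLSim API. It assumes the widely used per-class variant, in which each class's samples are divided among clients according to proportions drawn from a Dirichlet distribution. The function names `dirichlet_partition` and `sample_clients`, the seeds, and the toy label array are all hypothetical.

```python
import numpy as np


def dirichlet_partition(labels, num_clients=100, alpha=0.1, seed=0):
    """Split sample indices across clients using per-class Dirichlet shares.

    Hypothetical helper: assumes the common per-class partitioning scheme,
    which may differ from the exact procedure used in the paper.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        # Shuffle the indices of all samples belonging to this class.
        idx = rng.permutation(np.flatnonzero(labels == cls))
        # Share of this class that each client receives (sums to 1).
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        # Turn cumulative shares into split points inside idx.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cuts)):
            client_indices[client_id].extend(part.tolist())
    return client_indices


def sample_clients(num_clients, fraction, rng):
    """Pick a random subset of clients for one round, without replacement."""
    k = max(1, int(round(fraction * num_clients)))
    return rng.choice(num_clients, size=k, replace=False)


# Toy stand-in for CIFAR-100 training labels: 100 classes, 500 images each.
labels = np.repeat(np.arange(100), 500)
parts = dirichlet_partition(labels, num_clients=100, alpha=0.1, seed=0)

# 30% participation: 30 of the 100 clients take part in a given round.
round_clients = sample_clients(100, 0.3, np.random.default_rng(1))
```

With a small α such as 0.1, most of each class tends to land on a handful of clients, which is what makes the split highly heterogeneous. Larger values of α move the partition toward a uniform, near-iid split.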