COALA: A Practical and Vision-Centric Federated Learning Platform

Authors: Weiming Zhuang, Jian Xu, Chen Chen, Jingtao Li, Lingjuan Lyu

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct systematic benchmarking experiments for the practical FL scenarios and highlight potential opportunities for further advancements in FL.
Researcher Affiliation | Collaboration | 1 Sony AI, 2 Tsinghua University.
Pseudocode | No | The paper describes system workflows and customization steps in prose and provides code snippets for user customization examples, but it does not present formal pseudocode or clearly labeled algorithm blocks for its core methodology.
Open Source Code | No | The paper presents COALA as a platform but does not provide a specific link or an explicit statement that the source code for COALA or the described methodology is publicly available.
Open Datasets | Yes | We use the CIFAR-10/100 datasets for evaluation, and the number of labeled samples for each class is set to 400 for the label-in-server scenarios. For the label-in-client scenarios, we choose 10 labeled samples per class for each client.
Dataset Splits | Yes | In our benchmark experiments, we use the default training-test data split supported in the platform as described in Table 10. Basically, we make use of all the available training data, except for Digits5 and DomainNet, where we sample an identical amount of training data across different domains, as in (Li et al., 2020b).
Hardware Specification | Yes | All experiments are run on an AWS cloud server equipped with four V100 GPUs.
Software Dependencies | No | The paper mentions software components such as PyTorch, TensorFlow, gRPC, Protobuf, MQTT, wandb, and TensorBoard, but does not provide specific version numbers for any of them.
Experiment Setup | Yes | For local training, SGD is selected as the default local optimizer with mini-batch size 32, learning rate 0.01, momentum 0.9, weight decay 0.0005, and local epochs E = 5 unless otherwise mentioned. The number of communication rounds is set to 100 for the three multi-domain datasets and 200 for other datasets.
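
The Open Datasets row describes a semi-supervised setup with a fixed number of labeled samples per class: 400 per class for the label-in-server scenario and 10 per class per client for the label-in-client scenario. COALA's own data-loading API is not shown in this summary, so the sketch below uses plain PyTorch/torchvision to illustrate how such per-class labeled subsets could be drawn from CIFAR-10; the function name `sample_labeled_subset` and the use of torchvision are assumptions for illustration, not the platform's actual interface.

```python
import random
from collections import defaultdict

from torch.utils.data import Subset
from torchvision import transforms
from torchvision.datasets import CIFAR10


def sample_labeled_subset(dataset, num_per_class, seed=0):
    """Pick `num_per_class` indices per class; the rest can be treated as unlabeled.
    (Hypothetical helper, not part of COALA's published API.)"""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, (_, label) in enumerate(dataset):
        by_class[label].append(idx)
    labeled_indices = []
    for indices in by_class.values():
        rng.shuffle(indices)
        labeled_indices.extend(indices[:num_per_class])
    unlabeled_indices = sorted(set(range(len(dataset))) - set(labeled_indices))
    return Subset(dataset, labeled_indices), Subset(dataset, unlabeled_indices)


train_set = CIFAR10(root="./data", train=True, download=True,
                    transform=transforms.ToTensor())

# Label-in-server scenario: 400 labeled samples per class kept on the server.
server_labeled, server_unlabeled = sample_labeled_subset(train_set, num_per_class=400)

# Label-in-client scenario: 10 labeled samples per class on each client
# (each client would apply this to its own local partition).
client_labeled, client_unlabeled = sample_labeled_subset(train_set, num_per_class=10)
```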
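
The Experiment Setup row fixes the default local-training hyperparameters. The following is a minimal sketch, not COALA's actual API (whose class and function names are not given here), showing the corresponding PyTorch optimizer configuration inside a FedAvg-style local update; `model` and `local_loader` are assumed inputs, with batch size 32 set when constructing the DataLoader.

```python
import torch
import torch.nn as nn


def local_update(model, local_loader, device, local_epochs=5):
    """One client's local training pass with the paper's reported defaults:
    SGD with lr 0.01, momentum 0.9, weight decay 5e-4, and E = 5 local epochs."""
    model = model.to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                                momentum=0.9, weight_decay=5e-4)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for _ in range(local_epochs):
        for images, labels in local_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model.state_dict()

# A FedAvg-style server would average the returned state_dicts and repeat for
# 100 communication rounds on the multi-domain datasets, or 200 otherwise.
```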