Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering

Authors: Ekdeep Lubana, Chi Ian Tang, Fahim Kawsar, Robert Dick, Akhil Mathur

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We extensively compare Orchestra with several federated versions of centralized SSL techniques. We show that, unlike Orchestra, direct extension techniques are often sensitive to several important FL parameters (e.g., participation ratio, local epochs).
Researcher Affiliation | Collaboration | (1) EECS Department, University of Michigan, Ann Arbor, USA; (2) University of Cambridge, UK; (3) Nokia Bell Labs, Cambridge, UK.
Pseudocode | Yes | Algorithm: Below we provide a detailed algorithm of our pipeline, as described in Section 4 and outlined in Figure 3. Orchestra involves communication between clients and the server: Algorithm 1 describes the local training that happens on the clients, while Algorithm 2 outlines the computation that happens on the server.
Open Source Code | Yes | Working source code for Orchestra is available at the following GitHub link.
Open Datasets | Yes | Our experiments focus on the widely-used CIFAR-10 and CIFAR-100 datasets (Krizhevsky & Hinton, 2009).
Dataset Splits | Yes | Both datasets consist of 60,000 images, divided into two partitions: 50,000 for training and 10,000 for testing.
Hardware Specification | Yes | Our primary results in cross-device FL settings are presented for K = 100 clients, which we simulate on 8 NVIDIA V100 GPUs... We plot time consumed by a client running other methods, relative to when it runs Orchestra, and percent time consumed by local clustering in a round of Orchestra. As can be seen, Orchestra’s latency per round is similar to other methods, and clustering accounts for 0.009% of the training cost, confirming Orchestra’s practicality for cross-device FL.
Software Dependencies | No | Orchestra is implemented using PyTorch and the Flower federated learning framework (Beutel et al., 2020). While these software components are mentioned, specific version numbers are not provided.
Experiment Setup | Yes | Consistent with the paradigm of cross-device FL, we use a small batch size of 16 on each client, set the number of local epochs E to 10, communication rounds to 100, and participation ratio to 0.5. For cross-silo experiments, a batch size of 256 and a participation ratio of 1.0 are employed. ...We denote the learning rate by η and the EMA value by m.
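
The Pseudocode row above splits the pipeline into client-side training (Algorithm 1) and server-side computation (Algorithm 2). The sketch below illustrates only that client/server round structure; the `kmeans` and `clustering_loss` helpers are simplified stand-ins for the paper's actual clustering routine and training objective, not the released implementation.

```python
# Sketch of the client/server round structure described in the Pseudocode row.
# kmeans/clustering_loss are simplified placeholders, NOT the paper's method.
import copy
import torch
import torch.nn.functional as F


def kmeans(x, k, iters=10):
    """Plain k-means, used here only as a placeholder clustering routine."""
    centroids = x[torch.randperm(len(x))[:k]].clone()
    for _ in range(iters):
        assign = torch.cdist(x, centroids).argmin(dim=1)
        for j in range(k):
            if (assign == j).any():
                centroids[j] = x[assign == j].mean(dim=0)
    return centroids


def clustering_loss(z, centroids):
    """Placeholder loss: sharpen assignments of features to the global centroids."""
    logits = -torch.cdist(z, centroids)
    targets = logits.argmax(dim=1)
    return F.cross_entropy(logits, targets)


def client_update(model, loader, global_centroids, epochs, lr):
    """Algorithm 1 (sketch): local unsupervised training on one client."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, _ in loader:                      # labels are never used
            loss = clustering_loss(model(x), global_centroids)
            opt.zero_grad()
            loss.backward()
            opt.step()
    with torch.no_grad():                        # summarize local representations
        feats = torch.cat([model(x) for x, _ in loader])
    return model.state_dict(), kmeans(feats, k=16)


def server_round(global_model, clients, global_centroids, epochs=10, lr=0.01):
    """Algorithm 2 (sketch): aggregate client models and re-cluster local centroids."""
    states, cents = [], []
    for loader in clients:                       # sampled per the participation ratio
        sd, cent = client_update(copy.deepcopy(global_model), loader,
                                 global_centroids, epochs, lr)
        states.append(sd)
        cents.append(cent)
    # FedAvg-style parameter averaging
    avg = {k: torch.stack([s[k].float() for s in states]).mean(dim=0)
           for k in states[0]}
    global_model.load_state_dict(avg)
    # cluster the union of local centroids into globally consistent centroids
    return global_model, kmeans(torch.cat(cents), k=global_centroids.shape[0])
```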
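
For the Open Datasets and Dataset Splits rows, the stated 50,000/10,000 partition is the standard torchvision split, which can be checked as below; this is illustrative only and omits the paper's per-client data partitioning.

```python
# Verify the standard CIFAR-10 train/test split with torchvision
# (illustrative; the paper additionally partitions data across clients).
from torchvision import datasets, transforms

tfm = transforms.ToTensor()
train = datasets.CIFAR10(root="./data", train=True, download=True, transform=tfm)
test = datasets.CIFAR10(root="./data", train=False, download=True, transform=tfm)
print(len(train), len(test))  # 50000 10000 -> 60,000 images in total
```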
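
The Software Dependencies row names PyTorch and Flower without version numbers. A hypothetical Flower `NumPyClient` skeleton is sketched below; exact method signatures vary across Flower releases (one reason pinned versions matter), and the class and its contents are illustrative, not the released Orchestra client.

```python
# Illustrative Flower client skeleton (not the released Orchestra client code).
# NumPyClient method signatures differ slightly across Flower versions.
import flwr as fl
import torch


class UnsupervisedClient(fl.client.NumPyClient):
    def __init__(self, model, loader):
        self.model, self.loader = model, loader

    def get_parameters(self, config=None):
        # send current weights to the server as NumPy arrays
        return [p.detach().cpu().numpy() for p in self.model.parameters()]

    def fit(self, parameters, config):
        # load the global weights, then run the local unsupervised training loop
        with torch.no_grad():
            for p, new in zip(self.model.parameters(), parameters):
                p.copy_(torch.as_tensor(new, dtype=p.dtype))
        # ... local clustering / SSL training would go here (see sketch above) ...
        return self.get_parameters(), len(self.loader.dataset), {}
```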
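
The Experiment Setup row translates directly into a small configuration block; the key names below are illustrative and not taken from the released code (K = 100 comes from the Hardware Specification row).

```python
# Hypothetical configuration mirroring the reported settings; key names are
# illustrative and do not come from the released code.
CROSS_DEVICE = {
    "num_clients": 100,          # K, from the cross-device simulation
    "batch_size": 16,            # small per-client batches
    "local_epochs": 10,          # E
    "communication_rounds": 100,
    "participation_ratio": 0.5,  # fraction of clients sampled per round
}

CROSS_SILO = {
    "batch_size": 256,
    "participation_ratio": 1.0,  # every client participates each round
}
# The learning rate (η) and EMA value (m) are left unset here, since the
# quoted excerpt only introduces the notation.
```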