Federated Behavioural Planes: Explaining the Evolution of Client Behaviour in Federated Learning
Authors: Dario Fenoglio, Gabriele Dominici, Pietro Barbiero, Alberto Tonda, Martin Gjoreski, Marc Langheinrich
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that FBPs provide informative trajectories describing the evolving states of clients and their contributions to the global model, thereby enabling the identification of clusters of clients with similar behaviours. |
| Researcher Affiliation | Academia | Dario Fenoglio (Università della Svizzera italiana, Lugano, Switzerland, dario.fenoglio@usi.ch); Gabriele Dominici (Università della Svizzera italiana, Lugano, Switzerland, gabriele.dominici@usi.ch); Pietro Barbiero (Università della Svizzera italiana, Lugano, Switzerland, pietro.barbiero@usi.ch); Alberto Tonda (INRAE, Paris, France, alberto.tonda@inrae.fr); Martin Gjoreski (Università della Svizzera italiana, Lugano, Switzerland, martin.gjoreski@usi.ch); Marc Langheinrich (Università della Svizzera italiana, Lugano, Switzerland, marc.langheinrich@usi.ch) |
| Pseudocode | Yes | Additionally, we provide pseudo-code for both client-side (Algorithm 2) and server-side (Algorithm 1) implementations of our proposed approach, which includes creating behavioural planes on the server and applying our FBSs. |
| Open Source Code | Yes | Our code is publicly available on GitHub: https://github.com/dariofenoglio98/CF_FL |
| Open Datasets | Yes | In our experiments, we utilise five datasets: a Synthetic dataset (tabular) that we designed to have full control over clients' data distributions and thus test our assumptions; the Breast Cancer Wisconsin [36] (tabular); the Diabetes Health Indicator [37] (tabular); small-MNIST [38] (image); and small CIFAR-10 [39] (image), whose size we reduced by 76% to increase task difficulty and highlight differences in client performance. |
| Dataset Splits | Yes | Additionally, we partition each client's dataset locally, allocating 80% for training and 20% for validation. |
| Hardware Specification | Yes | All experiments were conducted on a workstation equipped with an NVIDIA RTX A6000 GPU, two AMD EPYC 7513 32-Core processors, and 512 GB of RAM. |
| Software Dependencies | Yes | For our experiments, we implemented all baselines and methods in Python 3.9 and relied on open-source libraries such as PyTorch 2.2 [67] (BSD license), scikit-learn 1.4 [68] (BSD license), and Flower 1.6 [66] (Apache License). In addition, we used Matplotlib 3.8.2 [69] (BSD license) and Seaborn 0.13 [70] (BSD license) to produce the plots shown in this paper. Data processing is performed using Pandas 2.2 [71] (BSD license). |
| Experiment Setup | Yes | This section describes essential information about the experiments. Further details on model configuration, training setup, and computational cost are presented in Appendices A.2, A.4, and A.6, respectively. Gradient descent was employed as the optimisation algorithm, with a batch size equal to the size of the training dataset. The momentum and learning rate were set to 0.9 and 0.01, respectively. For centralised training scenarios, the model was trained over 1,000 epochs. The Flower library was utilised to implement FL [66]. In all federated experiments, except for those evaluating local epochs, we employed 2 local epochs. During the assessment of the various defence mechanisms, the number of communication rounds was capped at 200 and the window length for the moving average was set to 30 rounds. |
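The Dataset Splits row above describes a simple per-client 80/20 train/validation partition. The snippet below is a minimal sketch of such a split using PyTorch; the helper name `make_local_split` and the fixed seed are illustrative assumptions, not taken from the paper's released code.

```python
# Minimal sketch of the per-client 80/20 train/validation split described above.
# `make_local_split` and the fixed seed are illustrative, not from the paper's repository.
import torch
from torch.utils.data import Dataset, random_split

def make_local_split(client_dataset: Dataset, train_frac: float = 0.8, seed: int = 0):
    """Partition one client's local dataset into 80% training / 20% validation."""
    n_total = len(client_dataset)
    n_train = int(train_frac * n_total)
    n_val = n_total - n_train
    generator = torch.Generator().manual_seed(seed)  # reproducible split per client
    return random_split(client_dataset, [n_train, n_val], generator=generator)
```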
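The Experiment Setup row quotes full-batch gradient descent with momentum 0.9, learning rate 0.01, and 2 local epochs per communication round. The sketch below shows one way to express that local update with `torch.optim.SGD`, assuming a client-side training loop; `model`, `train_set`, and `loss_fn` are placeholders rather than names from the authors' repository.

```python
# Minimal sketch of the quoted local training configuration: full-batch gradient
# descent (batch size = size of the training set), momentum 0.9, learning rate 0.01,
# and 2 local epochs per round. Placeholder names, not the authors' code.
import torch
from torch.utils.data import DataLoader

def local_update(model, train_set, loss_fn, local_epochs: int = 2):
    loader = DataLoader(train_set, batch_size=len(train_set), shuffle=False)  # full batch
    optimiser = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    model.train()
    for _ in range(local_epochs):
        for x, y in loader:  # one full-batch gradient step per epoch
            optimiser.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimiser.step()
    return model.state_dict()  # parameters returned to the Flower server
```

Because the batch size equals the full training set, each epoch performs exactly one gradient step, so 2 local epochs correspond to 2 full-batch updates per communication round.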
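The same row mentions a 30-round moving-average window used while evaluating defence mechanisms. The quote does not specify which quantity is smoothed, so the sketch below applies the window to a generic per-round score sequence purely for illustration.

```python
# Illustrative 30-round trailing moving average over a generic per-round score
# sequence. Which scores the paper smooths is an assumption here; see the paper
# for how the window is used within the proposed defence.
import numpy as np

def moving_average(scores, window: int = 30) -> np.ndarray:
    """Trailing moving average with a fixed window length."""
    scores = np.asarray(scores, dtype=float)
    kernel = np.ones(window) / window
    return np.convolve(scores, kernel, mode="valid")
```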