Tolerating Outliers: Gradient-Based Penalties for Byzantine Robustness and Inclusion

Authors: Latifa Errami, El Houcine Bergou

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical analysis further validates the viability of the proposed approach. Across mild to strong non-IID data splits, our method consistently matches or surpasses the performance of current approaches in the literature under state-of-the-art Byzantine attack scenarios. In this section we evaluate the introduced LS variants on both IID and non-IID data when a proportion of the clients is Byzantine. We give a detailed description of the experimental setup: the datasets used, the techniques deployed for non-IID data partitioning, and the model architectures, alongside all relevant hyper-parameters. Finally, we report Top-1 Test Accuracy as the evaluation metric.
Researcher Affiliation | Academia | Latifa Errami, El Houcine Bergou; College of Computing, Mohammed VI Polytechnic University, Ben Guerir, Morocco; latifa.errami@um6p.ma, elhoucine.bergou@um6p.ma
Pseudocode | Yes | Algorithm 1: F_LS
Open Source Code | No | The paper does not provide any concrete statement or link regarding the release of its source code for the described methodology.
Open Datasets | Yes | We study the following datasets: MNIST [LeCun et al., 1998], FMNIST [Xiao et al., 2017], SVHN [Netzer et al., 2011], and CIFAR10 [Krizhevsky and Hinton, 2009]. (A hedged loading sketch for these datasets follows the table.)
Dataset Splits | No | The paper describes how the datasets are partitioned for IID and non-IID scenarios (e.g., using Latent Dirichlet Sampling with β values) but does not provide specific percentages or counts for the train, validation, and test splits used in the experiments. (A hedged partitioning sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or the types of computing instances used to run its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiments.
Experiment Setup | Yes | Specifically, when αb is small enough compared to αt, the influence of σI becomes negligible. Empirical observations, as illustrated in Figure 1, demonstrate good performance even when αb is equal to αt. In Figure 1 we investigate the impact of the choice of the hyper-parameter αb on the performance of our approach. Clearly, αt = αb = 1 is the best choice overall, as this strategy keeps the output of the standard aggregation and penalizes the dropped contributions based on their deviation from the output of F, with no further penalty. (An illustrative aggregation sketch follows the table.)
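
The Open Datasets row names MNIST, FMNIST, SVHN, and CIFAR10. The paper does not describe its data pipeline, so the following is only a minimal sketch that assumes the standard torchvision loaders; the data root and transform are illustrative.

```python
# Minimal sketch (assumption: torchvision loaders; the paper does not state its data pipeline).
# Loads the four benchmark datasets named in the Open Datasets row.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

train_sets = {
    "MNIST":   datasets.MNIST("./data", train=True, download=True, transform=to_tensor),
    "FMNIST":  datasets.FashionMNIST("./data", train=True, download=True, transform=to_tensor),
    "SVHN":    datasets.SVHN("./data", split="train", download=True, transform=to_tensor),
    "CIFAR10": datasets.CIFAR10("./data", train=True, download=True, transform=to_tensor),
}

for name, ds in train_sets.items():
    print(f"{name}: {len(ds)} training samples")
```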
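
The Dataset Splits row mentions Latent Dirichlet Sampling with β values for the non-IID partitioning. A common way to realize such a split is to draw per-client class proportions from a Dirichlet(β) distribution; the sketch below assumes that reading, and the function name and parameters are illustrative rather than taken from the paper.

```python
# Sketch of Dirichlet-based non-IID partitioning (assumption: this mirrors the
# "Latent Dirichlet Sampling" the report refers to; names are illustrative).
import numpy as np

def dirichlet_partition(labels, num_clients, beta, seed=0):
    """Split sample indices across clients; smaller beta -> more skewed (stronger non-IID) splits."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx_c = np.where(labels == c)[0]
        rng.shuffle(idx_c)
        # Draw per-client proportions for this class from Dirichlet(beta, ..., beta).
        proportions = rng.dirichlet(np.full(num_clients, beta))
        # Convert proportions to split points and distribute the class indices.
        split_points = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, part in enumerate(np.split(idx_c, split_points)):
            client_indices[client_id].extend(part.tolist())
    return client_indices

# Example: 10 clients; beta = 0.5 gives a milder skew than beta = 0.1.
labels = np.random.default_rng(0).integers(0, 10, size=60_000)
parts = dirichlet_partition(labels, num_clients=10, beta=0.5)
print([len(p) for p in parts])
```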
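
The Experiment Setup row describes the role of αt and αb: the output of the standard aggregation F is kept, and dropped contributions are penalized according to their deviation from that output. The paper's actual procedure is Algorithm 1 (F_LS), which is not reproduced here; the sketch below only illustrates one plausible deviation-based re-weighting consistent with that description, and every name and formula in it is an assumption.

```python
# Illustrative sketch only (assumption: NOT the paper's F_LS algorithm; it shows the idea of
# down-weighting, rather than discarding, dropped contributions by their deviation from the
# robust aggregate f_out, with alpha_t / alpha_b controlling the two groups' weights).
import numpy as np

def penalized_aggregate(grads, trusted_mask, alpha_t=1.0, alpha_b=1.0):
    """grads: (n_clients, dim) array; trusted_mask: True for clients kept by the robust rule."""
    f_out = grads[trusted_mask].mean(axis=0)          # standard aggregation over kept clients
    deviations = np.linalg.norm(grads - f_out, axis=1)
    # Trusted clients keep full weight alpha_t; dropped clients are penalized by their
    # deviation from f_out instead of being excluded outright.
    weights = np.where(trusted_mask, alpha_t, alpha_b / (1.0 + deviations))
    return (weights[:, None] * grads).sum(axis=0) / weights.sum()

# Toy usage: 5 clients, 3 trusted, 2 dropped.
rng = np.random.default_rng(0)
g = rng.normal(size=(5, 4))
mask = np.array([True, True, True, False, False])
print(penalized_aggregate(g, mask, alpha_t=1.0, alpha_b=1.0))
```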