Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Unified Breakdown Analysis for Byzantine Robust Gossip
Authors: Renaud Gaucher, Aymeric Dieuleveut, Hadrien Hendrikx
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We give experimental evidence to validate the effectiveness of CS+ RG and highlight the gap with NNA, in particular against a novel attack tailored to decentralized communications. ... Section 6. Experimental evaluation. We follow Farhadkhani et al. (2023) (on which the core of our code is based), and present results for classification tasks on MNIST and CIFAR-10 datasets, as well as plain averaging tasks. ... In Figure 1, it appears that the Sp H attack is more efficient in disrupting Clipped Gossip, GTS RG and IOS than Dissensus and ALIE, and that CS+ RG is highly resilient in the setup considered. |
| Researcher Affiliation | Academia | 1Centre de mathématiques appliquées, École polytechnique, Institut Polytechnique de Paris, Palaiseau, France; 2Centre Inria de l'Univ. Grenoble Alpes, CNRS, LJK, Grenoble, France. Correspondence to: Renaud Gaucher <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Byzantine-Resilient D-SGD with F RG |
| Open Source Code | Yes | See Appendix B for a detailed experimental setup and our implementation available at https://github.com/renaudgaucher/Byzantine-Robust-Gossip. |
| Open Datasets | Yes | We follow Farhadkhani et al. (2023) (on which the core of our code is based), and present results for classification tasks on MNIST and CIFAR-10 datasets, as well as plain averaging tasks. |
| Dataset Splits | No | The paper uses well-known datasets (MNIST, CIFAR-10) but does not explicitly state the training/test/validation splits used, nor does it refer to specific standard splits with citations within the text. It describes data heterogeneity and preprocessing, but not the partitioning into subsets for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. It describes the experimental setup in terms of software and dataset usage but omits hardware specifications. |
| Software Dependencies | No | The paper states, 'Our experimental setting is built on top of the code provided by Farhadkhani et al. (2023),' indicating a dependency. However, it does not provide specific version numbers for any software libraries, frameworks, or programming languages used (e.g., Python 3.x, PyTorch 1.x, CUDA). |
| Experiment Setup | Yes | The architecture of the model used and the experimental setup are proposed in Table 1. Table 1 (detailed experimental setting), MNIST / CIFAR-10: Model type CNN / CNN; Batch size 64 / 64; Learning rate η_op = 0.1 / η_op = 0.5; Momentum β = 0.9 / β = 0.99; Number of iterations T = 300 / T = 5000. |
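The hyperparameters reported in Table 1 can be transcribed as a small configuration sketch. The values come from the paper; the dictionary layout and the name `EXPERIMENT_SETUP` are illustrative assumptions, not the authors' code.

```python
# Hyperparameters from Table 1 of the paper, one entry per dataset.
# Layout and naming are illustrative, not taken from the released code.
EXPERIMENT_SETUP = {
    "MNIST": {
        "model": "CNN",
        "batch_size": 64,
        "learning_rate": 0.1,   # η_op
        "momentum": 0.9,        # β
        "iterations": 300,      # T
    },
    "CIFAR-10": {
        "model": "CNN",
        "batch_size": 64,
        "learning_rate": 0.5,   # η_op
        "momentum": 0.99,       # β
        "iterations": 5000,     # T
    },
}
```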
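To illustrate the kind of algorithm behind the "Pseudocode" row (Byzantine-resilient D-SGD with a robust gossip step), here is a minimal toy sketch on scalar parameters over a complete graph. It is not the paper's F RG rule: the clipping-based aggregation (in the style of ClippedGossip), the threshold `tau`, the attack value, and all function names are assumptions made for illustration.

```python
# Toy Byzantine-resilient D-SGD sketch (NOT the paper's exact rule):
# each honest node takes a local gradient step, then adds the average of
# its neighbors' deviations, clipped in magnitude to `tau` so that a
# Byzantine node cannot drag it arbitrarily far.

def clip(delta, tau):
    """Scale `delta` so its magnitude is at most `tau`."""
    m = abs(delta)
    return delta if m <= tau else delta * tau / m

def robust_gossip_step(x, grads, lr, tau, byzantine, attack_value):
    """One round of D-SGD with clipped gossip on a complete graph."""
    # Byzantine nodes broadcast an arbitrary value instead of their state.
    sent = [attack_value if i in byzantine else xi for i, xi in enumerate(x)]
    new_x = []
    for i, xi in enumerate(x):
        if i in byzantine:
            new_x.append(xi)  # attacker's internal state is irrelevant
            continue
        yi = xi - lr * grads[i](xi)  # local gradient step
        # Average the clipped deviations toward each neighbor's message.
        deltas = [clip(sj - yi, tau) for j, sj in enumerate(sent) if j != i]
        new_x.append(yi + sum(deltas) / len(deltas))
    return new_x

# Toy run: honest nodes i minimize (x - t_i)^2 / 2; node 3 is Byzantine
# and broadcasts a large constant every round.
targets = [1.0, 2.0, 3.0]
grads = [lambda x, t=t: (x - t) for t in targets] + [None]
x = [0.0, 0.0, 0.0, 0.0]
for _ in range(200):
    x = robust_gossip_step(x, grads, lr=0.1, tau=0.5, byzantine={3},
                           attack_value=100.0)
# Honest nodes settle near their targets; unclipped averaging would
# instead be dragged toward the attacker's value of 100.
```

The clipping threshold trades off robustness against consensus speed: a smaller `tau` bounds the attacker's per-round influence more tightly but also slows agreement among honest nodes.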