Byzantine-Tolerant Methods for Distributed Variational Inequalities

Authors: Nazarii Tupitsa, Abdulla Jasem Almansoori, Yanlin Wu, Martin Takáč, Karthik Nandakumar, Samuel Horváth, Eduard Gorbunov

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our work makes a further step in this direction by providing several (provably) Byzantine-robust methods for distributed variational inequality, thoroughly studying their theoretical convergence, removing the limitations of the previous work, and providing numerical comparisons supporting the theoretical findings."
Researcher Affiliation | Academia | Nazarii Tupitsa (MBZUAI, MIPT); Abdulla Jasem Almansoori (MBZUAI); Yanlin Wu (MBZUAI); Martin Takáč (MBZUAI); Karthik Nandakumar (MBZUAI); Samuel Horváth (MBZUAI); Eduard Gorbunov
Pseudocode | Yes | Algorithm 1: SGDA-RA; Algorithm 2: SEG-RA; Algorithm 3: M-SGDA-RA; Algorithm 4: Check Computations; Algorithm 5: SGDA-CC; Algorithm 6: R-SGDA-CC; Algorithm 7: SEG-CC; Algorithm 8: R-SEG-CC. (A hedged sketch of the robust-aggregation update appears after the table.)
Open Source Code | Yes | "Code for quadratic games is available at https://github.com/nazya/sgda-ra. ... Code for GANs is available at https://github.com/zeligism/vi-robust-agg."
Open Datasets | Yes | "We conduct numerical experiments on a quadratic game... Robust Neural Networks training. ... {(x_i, y_i)}_{i=1}^N is the MNIST dataset. ... The dataset we chose for this experiment is CIFAR-10."
Dataset Splits | Yes | "Specifically, we show the validation error on MNIST after each epoch."
Hardware Specification | No | The paper mentions "simulate n = 20 nodes on a single machine" but does not provide specific hardware details such as CPU/GPU models, processor types, or memory.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow).
Experiment Setup | Yes | "We set the parameter α = 0.1 for M-SGDA-RA, and the following parameters for RDEG: α_RDEG = 0.06, δ_RDEG = 0.9, and the theoretical value of ϵ. ... γ = 2e-5. ... We fix the learning rate to 0.01 and use a batch size of 32. We run the algorithm for 50 epochs and average our results across 3 runs. ... We let n = 20, B = 4, λ_1 = 0, and λ_2 = 100. ... We let n = 10, B = 2, and choose a learning rate of 0.001, β_1 = 0.5, and β_2 = 0.9 with a batch size of 64. We run the algorithms for 4600 epochs." (An illustrative configuration sketch is given below the table.)
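The Pseudocode row lists SGDA-RA (stochastic gradient descent-ascent with a robust aggregator) among the paper's methods. The snippet below is a minimal sketch of that general pattern, not the paper's Algorithm 1: the strongly monotone quadratic game, the Gaussian oracle noise, the sign-flip attack, the coordinate-wise median aggregator, and all numeric constants are our illustrative assumptions.

```python
# Hedged sketch of an SGDA-with-robust-aggregation update (in the spirit of SGDA-RA).
import numpy as np

rng = np.random.default_rng(0)
d, n_workers, n_byz = 10, 20, 4
gamma, n_steps, lam = 0.01, 2000, 1.0

A = rng.standard_normal((d, d))                     # coupling matrix of the game
x, y = rng.standard_normal(d), rng.standard_normal(d)

def operator(x, y):
    """F(x, y) = (grad_x f, -grad_y f) for f(x, y) = (lam/2)||x||^2 + x^T A y - (lam/2)||y||^2."""
    return np.concatenate([lam * x + A @ y, lam * y - A.T @ x])

def honest_msg(x, y):
    return operator(x, y) + 0.1 * rng.standard_normal(2 * d)   # noisy stochastic estimate

def byzantine_msg(x, y):
    return -10.0 * operator(x, y)                               # sign-flip-and-scale attack

for _ in range(n_steps):
    msgs = [honest_msg(x, y) for _ in range(n_workers - n_byz)]
    msgs += [byzantine_msg(x, y) for _ in range(n_byz)]
    g = np.median(np.stack(msgs), axis=0)   # robust aggregation: coordinate-wise median
    x = x - gamma * g[:d]                   # descent step on x
    y = y - gamma * g[d:]                   # ascent step on y (sign folded into F)

print("operator norm at the last iterate:", np.linalg.norm(operator(x, y)))
```

With 4 Byzantine workers out of 20, the coordinate-wise median keeps the aggregated direction close to the honest operator estimates, so the iterates still approach the solution despite the attack; swapping the aggregator or the attack model only changes the two helper functions.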
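The Experiment Setup row stitches together hyperparameters from several experiments. The sketch below regroups the quoted values into a single configuration object; the split into three experiments and all key names are assumptions made for readability, not structure taken from the paper or its code, and only the numeric values come from the quoted text.

```python
# Hedged regrouping of the hyperparameters quoted in the Experiment Setup row.
experiment_setup = {
    "quadratic_games": {
        "n_nodes": 20, "n_byzantine": 4,
        "lambda_1": 0.0, "lambda_2": 100.0,
        "step_size_gamma": 2e-5,
        "m_sgda_ra": {"alpha": 0.1},
        "rdeg": {"alpha": 0.06, "delta": 0.9},   # plus the theoretical value of eps
    },
    "mnist_robust_training": {
        "learning_rate": 0.01, "batch_size": 32,
        "epochs": 50, "runs_averaged": 3,
    },
    "cifar10_gan": {
        "n_nodes": 10, "n_byzantine": 2,
        "learning_rate": 0.001, "adam_betas": (0.5, 0.9),
        "batch_size": 64, "epochs": 4600,
    },
}
```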