Distributed Newton Can Communicate Less and Resist Byzantine Workers

Authors: Avishek Ghosh, Raj Kumar Maity, Arya Mazumdar

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Furthermore, we validate our theoretical results with extensive experiments on synthetic and benchmark LIBSVM [4] data-sets and demonstrate convergence guarantees."
Researcher Affiliation | Academia | Avishek Ghosh, Department of EECS, UC Berkeley, Berkeley, CA 94720, avishek_ghosh@berkeley.edu; Raj Kumar Maity, College of Information and Computer Sciences, UMass Amherst, MA 01002, rajkmaity@cs.umass.edu; Arya Mazumdar, College of Information and Computer Sciences, UMass Amherst, MA 01002, arya@cs.umass.edu
Pseudocode | Yes | Algorithm 1: COMmunication-efficient and Robust Approximate Distributed nEwton (COMRADE). A hedged sketch of one COMRADE round appears after this table.
Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | "We choose a9a (d = 123, n ≈ 32K), w5a (d = 300, n ≈ 10k), Epsilon (d = 2000, n = 0.4M) and covtype.binary (d = 54, n ≈ 0.5M) classification datasets and partition the data in 20 different worker machines. In the experiments, we choose two types of Byzantine attacks: (1) a flipped-label attack, where (for binary classification) the Byzantine worker machines flip the labels of the data, thus making the model learn with wrong labels, and (2) a negative-update attack, where the Byzantine worker machines compute the local update p̂i and communicate −c·p̂i with c ∈ (0, 1), making the updates point opposite to the actual direction. We choose β = α + 2/m. We choose the regularization parameter λ = 1 and a fixed step size. We ran the algorithms for a sufficient number of steps to ensure convergence." A short simulation of both attacks appears after this table.
Dataset Splits | No | The paper mentions using specific datasets but does not explicitly state the train/validation/test splits, percentages, or sample counts needed for reproduction. It only states that the data is partitioned among worker machines.
Hardware Specification | No | The paper states: "We use mpi4py package for distributed framework (swarm2) at the University of Massachusetts Amherst [28]." While it mentions a cluster (swarm2), it does not provide any specific details about the hardware components (e.g., CPU or GPU models, memory).
Software Dependencies | No | The paper mentions using the mpi4py Python package but does not specify version numbers for it or any other software dependency.
Experiment Setup | Yes | "We choose the regularization parameter λ = 1 and a fixed step size. We choose β = α + 2/m. We ran the algorithms for a sufficient number of steps to ensure convergence. We consider several types of Byzantine attacks: (1) a flipped-label attack, where (for binary classification) the Byzantine worker machines flip the labels of the data, thus making the model learn with wrong labels, and (2) a negative-update attack, where the Byzantine worker machines compute the local update p̂i and communicate −c·p̂i with c ∈ (0, 1), making the updates point opposite to the actual direction." An end-to-end sketch of this setup appears after this table.
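
For concreteness, here is a minimal numpy sketch of one COMRADE round for l2-regularized logistic regression, the loss used in the paper's experiments. It is a sketch under assumptions, not the authors' code: the function names are hypothetical, and the aggregation shown is norm-based trimming (keep the (1 − β)m local Newton directions with the smallest norms, then average), which is how we read Algorithm 1.

```python
import numpy as np

def local_newton_direction(X, y, w, lam=1.0):
    """One worker's local Newton direction p_hat = H^{-1} g for
    l2-regularized logistic regression with labels y in {-1, +1}."""
    n, d = X.shape
    margins = y * (X @ w)
    sigma = 1.0 / (1.0 + np.exp(margins))      # sigmoid(-y * x.w)
    g = -(X.T @ (y * sigma)) / n + lam * w     # local gradient
    s = sigma * (1.0 - sigma)                  # per-example Hessian weights
    H = (X.T * s) @ X / n + lam * np.eye(d)    # local Hessian
    return np.linalg.solve(H, g)

def comrade_round(directions, beta, step=1.0):
    """Center step: drop the beta fraction of local directions with the
    largest norms, average the survivors, and return the scaled update."""
    m = len(directions)
    keep = int(np.ceil((1.0 - beta) * m))
    norms = np.array([np.linalg.norm(p) for p in directions])
    survivors = np.argsort(norms)[:keep]       # smallest-norm updates survive
    p_bar = np.mean([directions[i] for i in survivors], axis=0)
    return step * p_bar                        # caller applies w <- w - update
```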
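
The two attacks quoted in the table are easy to simulate. A minimal sketch, again with hypothetical helper names and labels assumed in {−1, +1}:

```python
import numpy as np

def flipped_label_attack(y):
    """Flipped-label attack: a Byzantine worker trains on negated binary
    labels, so its local computation is based on wrong labels."""
    return -y

def negative_update_attack(p_hat, c=0.5):
    """Negative-update attack: the worker computes its honest local update
    p_hat but communicates -c * p_hat with c in (0, 1), i.e. a scaled
    update pointing opposite to the actual direction."""
    return -c * np.asarray(p_hat)
```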
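
Putting the pieces together, here is a hedged end-to-end sketch of the quoted setup (m = 20 workers, λ = 1, fixed step size, β = α + 2/m, a fraction α of negative-update attackers). It reuses the hypothetical helpers sketched above and substitutes synthetic data for the LIBSVM datasets, so it illustrates the training loop rather than reproducing the paper's numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
m, alpha, lam, step = 20, 0.1, 1.0, 1.0
beta = alpha + 2.0 / m                       # trimming level beta = alpha + 2/m

# Synthetic stand-in for a LIBSVM dataset, split across m worker machines.
n, d = 4000, 20
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n))
shards = np.array_split(np.arange(n), m)
byzantine = set(range(int(alpha * m)))       # indices of corrupt workers

w = np.zeros(d)
for t in range(50):                          # "sufficient number of steps"
    directions = []
    for i, idx in enumerate(shards):
        p = local_newton_direction(X[idx], y[idx], w, lam=lam)
        if i in byzantine:
            p = negative_update_attack(p, c=0.5)
        directions.append(p)
    w -= comrade_round(directions, beta=beta, step=step)
```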