Distributed Newton Can Communicate Less and Resist Byzantine Workers
Authors: Avishek Ghosh, Raj Kumar Maity, Arya Mazumdar
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Furthermore, we validate our theoretical results with extensive experiments on synthetic and benchmark LIBSVM [4] data-sets and demonstrate convergence guarantees. |
| Researcher Affiliation | Academia | Avishek Ghosh, Department of EECS, UC Berkeley, Berkeley, CA 94720, avishek_ghosh@berkeley.edu; Raj Kumar Maity, College of Information and Computer Sciences, UMass Amherst, MA 01002, rajkmaity@cs.umass.edu; Arya Mazumdar, College of Information and Computer Sciences, UMass Amherst, MA 01002, arya@cs.umass.edu |
| Pseudocode | Yes | Algorithm 1: COMmunication-efficient and Robust Approximate Distributed nEwton (COMRADE) |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We choose a9a (d = 123, n ≈ 32K), w5a (d = 300, n ≈ 10K), Epsilon (d = 2000, n = 0.4M) and covtype.binary (d = 54, n ≈ 0.5M) classification datasets from LIBSVM and partition the data among 20 different worker machines. |
| Dataset Splits | No | The paper mentions using specific datasets but does not explicitly state the train/validation/test splits, percentages, or sample counts used for reproduction. It only states that data is partitioned among worker machines. |
| Hardware Specification | No | The paper reports running its distributed framework with the mpi4py Python package on the swarm2 cluster at the University of Massachusetts Amherst [28]. While it names the cluster, it does not provide any specific details about the hardware components (e.g., CPU or GPU models, memory). A minimal mpi4py sketch of such a distributed setup is given after the table. |
| Software Dependencies | No | The paper mentions using the mpi4py Python package but does not specify version numbers for it or for any other software dependencies. |
| Experiment Setup | Yes | We choose the regularization parameter λ = 1 and a fixed step size, and set β = α + 2/m. We ran the algorithms for a sufficient number of steps to ensure convergence. We consider two types of Byzantine attacks: (1) a flipped-label attack, where (for binary classification) the Byzantine worker machines flip the labels of their data, making the model learn with wrong labels; and (2) a negative-update attack, where the Byzantine worker machines compute the local update p̂_i and communicate −c p̂_i with c ∈ (0, 1), making the communicated update point opposite to the actual direction. A minimal sketch of these two attacks is given after the table. |
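The two Byzantine behaviours described in the Experiment Setup row translate directly into code. Below is a minimal Python sketch of how such attacks could be simulated; the function names, the toy arrays, and the choice c = 0.5 are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def flipped_label_attack(y):
    """A Byzantine worker flips binary labels in {-1, +1}, so its shard
    trains the model with wrong labels."""
    return -y

def negative_update_attack(p_hat, c=0.5):
    """A Byzantine worker computes the honest local update p_hat but
    communicates -c * p_hat (c assumed to lie in (0, 1)), a shrunken
    update pointing opposite to the true direction."""
    return -c * p_hat

# Toy illustration (values are arbitrary).
y = np.array([1.0, -1.0, 1.0, 1.0, -1.0])
p_hat = np.array([0.3, -1.2, 0.7])
print(flipped_label_attack(y))        # [-1.  1. -1. -1.  1.]
print(negative_update_attack(p_hat))  # [-0.15  0.6  -0.35]
```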
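The experiments reportedly partition each dataset across 20 worker machines and use mpi4py for communication. The sketch below shows one minimal way such a setup could look; it is an assumption-laden stand-in rather than the authors' code. The synthetic per-worker shards, the regularized logistic-regression gradient used as the local update, and the plain averaging at the master are placeholders for COMRADE's actual second-order update p̂_i and its Byzantine filtering, which the table does not reproduce.

```python
# Run with: mpiexec -n 20 python distributed_sketch.py  (script name is illustrative)
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each worker holds its own shard; a synthetic shard stands in for one of
# the 20 LIBSVM partitions described in the paper.
rng = np.random.default_rng(seed=rank)
d = 123                                # e.g., the a9a dimensionality
X = rng.standard_normal((1000, d))
y = rng.choice([-1.0, 1.0], size=1000)

# Master broadcasts the current model to all workers.
w = np.zeros(d) if rank == 0 else None
w = comm.bcast(w, root=0)

# Local update: a regularized logistic-regression gradient (lambda = 1),
# used here only as a placeholder for the worker's communicated update.
lam = 1.0
margins = y * (X @ w)
local_update = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y) + lam * w

# Workers send their updates to the master, which aggregates them.
updates = comm.gather(local_update, root=0)
if rank == 0:
    aggregated = np.mean(updates, axis=0)   # COMRADE would also filter suspect machines
    print("aggregated update norm:", np.linalg.norm(aggregated))
```

Launching with `mpiexec -n 20` mirrors the 20-machine partition used in the experiments.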