Robust and Communication-Efficient Collaborative Learning

Authors: Amirhossein Reisizadeh, Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we numerically evaluate the performance of the proposed QuanTimed-DSGD method described in Algorithm 1 for solving a class of non-convex decentralized optimization problems. In particular, we compare the total run-time of QuanTimed-DSGD scheme with the ones for three benchmarks which are briefly described below. ... We carry out two sets of experiments over CIFAR-10 and MNIST datasets..."
Researcher Affiliation | Academia | Amirhossein Reisizadeh, ECE Department, University of California, Santa Barbara (reisizadeh@ucsb.edu); Hossein Taheri, ECE Department, University of California, Santa Barbara (hossein@ucsb.edu); Aryan Mokhtari, ECE Department, The University of Texas at Austin (mokhtari@austin.utexas.edu); Hamed Hassani, ESE Department, University of Pennsylvania (hassani@seas.upenn.edu); Ramtin Pedarsani, ECE Department, University of California, Santa Barbara (ramtin@ece.ucsb.edu)
Pseudocode | Yes | "Algorithm 1 QuanTimed-DSGD at node i"
Open Source Code | No | The paper does not provide an explicit statement or link indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | "We carry out two sets of experiments over CIFAR-10 and MNIST datasets, where each worker is assigned with a sample set of size m = 200 for both datasets."
Dataset Splits | No | The paper mentions using the CIFAR-10 and MNIST datasets and assigns `m = 200` samples per worker, but it does not explicitly provide train/validation/test splits (percentages, counts, or citations to predefined splits).
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks.
Experiment Setup | Yes | "In experiments over CIFAR-10, step-sizes are fine-tuned as follows: (α, ε) = (0.08/T^{1/6}, 14/T^{1/2}) for QuanTimed-DSGD and Q-DSGD, and α = 0.015 for DSGD and Asynchronous DSGD. In MNIST experiments, step-sizes are fine-tuned to (α, ε) = (0.3/T^{1/6}, 15/T^{1/2}) for QuanTimed-DSGD and Q-DSGD, and α = 0.2 for DSGD. We implement the unbiased low-precision quantizer in (7) with various quantization levels s, and we let T_c denote the communication time of a p-vector without quantization (16-bit precision). The communication time for a quantized vector is then proportioned according to the quantization level. In order to ensure that the expected batch size used in each node is a target positive number b, we choose the deadline T_d = b/E[V], where V ~ Uniform(10, 90) is the random computation speed. The communication graph is a random Erdős–Rényi graph with edge connectivity p_c = 0.4 and n = 50 nodes. The weight matrix is designed as W = I - L/δ, where L is the Laplacian matrix of the graph and δ > λ_max(L)/2."
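
For concreteness, the quantizer, deadline, and mixing matrix described in this setup can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' code: it assumes the standard QSGD-style unbiased form for the quantizer in the paper's equation (7), and the function names, random seed, and the 1.01 safety factor on δ are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def unbiased_quantize(x, s):
    """Unbiased stochastic quantizer with s levels (QSGD-style assumption).

    Each |x_k|/||x|| is stochastically rounded to the grid {0, 1/s, ..., 1},
    so E[Q(x)] = x. The exact form of the paper's quantizer (7) is assumed,
    not copied, here.
    """
    norm = np.linalg.norm(x)
    if norm == 0.0:
        return np.zeros_like(x)
    scaled = np.abs(x) / norm * s                # coordinates mapped into [0, s]
    lower = np.floor(scaled)
    level = lower + (rng.random(x.shape) < scaled - lower)  # stochastic rounding
    return norm * np.sign(x) * level / s

def deadline(b, v_lo=10.0, v_hi=90.0):
    """T_d = b / E[V]: with computation speed V ~ Uniform(v_lo, v_hi),
    a node processes b samples before the deadline in expectation."""
    return b / ((v_lo + v_hi) / 2.0)

def mixing_matrix(n=50, pc=0.4):
    """W = I - L/delta for a random Erdos-Renyi graph G(n, pc).

    Choosing delta > lambda_max(L)/2 puts every eigenvalue of W in (-1, 1],
    as the setup above requires.
    """
    upper = np.triu(rng.random((n, n)) < pc, k=1).astype(float)
    adj = upper + upper.T                        # symmetric adjacency, zero diagonal
    lap = np.diag(adj.sum(axis=1)) - adj         # graph Laplacian L = D - A
    delta = 1.01 * np.linalg.eigvalsh(lap)[-1] / 2  # strictly above the bound
    return np.eye(n) - lap / delta

# Example: quantize a model vector and build the 50-node topology.
x = rng.standard_normal(100)
q = unbiased_quantize(x, s=4)
W = mixing_matrix()
print(deadline(b=10))                            # T_d = 10 / 50 = 0.2
print(np.allclose(W.sum(axis=1), 1.0))           # rows of W sum to 1
```

With a target batch size b = 10 and E[V] = (10 + 90)/2 = 50, the deadline evaluates to T_d = 0.2, and the row-sum check reflects why the construction works: every eigenvalue of W = I - L/δ equals 1 - λ_i(L)/δ, which lies in (-1, 1] whenever δ > λ_max(L)/2.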