Quantized Decentralized Stochastic Learning over Directed Graphs
Authors: Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical evaluations corroborate our main theoretical results and illustrate significant speed-up compared to the exact-communication methods. From the Numerical Experiments section: In this section, we compare the proposed methods for communication-efficient message passing over directed graphs with the push-sum protocol using exact communication (e.g., as formulated in (Kempe et al., 2003; Tsianos et al., 2012) for gossip or optimization problems). |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, University of California, Santa Barbara, Santa Barbara, CA, USA. 2Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX, USA. 3Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA. |
| Pseudocode | Yes | Algorithm 1 (Quantized Push-Sum for Consensus over Directed Graphs); Algorithm 2 (Quantized Decentralized SGD over Directed Graphs). An illustrative push-sum sketch appears after the table. |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the release of its source code. |
| Open Datasets | Yes | In order to illustrate this, we train a neural network with 10 hidden units and a sigmoid activation function to classify the MNIST dataset into 10 classes. |
| Dataset Splits | No | The paper mentions using training and test sets (implicitly for MNIST) but does not provide specific details about a validation set split. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies (e.g., programming languages, libraries, frameworks). |
| Experiment Setup | Yes | The step size for each setting is fine-tuned up to iteration 50 among 20 values in the interval [0.01, 3]. For each setting, the step size is fine-tuned up to iteration 200 over 15 values in the interval [0.1, 3]. We use the graph G1 with 10 nodes, where each node has access to 1000 samples of the dataset and uses a randomly selected mini-batch of size 10 to compute the local stochastic gradient. A hedged sketch of the step-size grid search appears after the table. |
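
The pseudocode entry names Algorithm 1, a quantized push-sum scheme for consensus over directed graphs. As a rough illustration of that style of scheme, here is a minimal Python sketch of push-sum averaging over a directed graph in which every transmitted vector first passes through an unbiased stochastic quantizer. The quantizer, the column-stochastic mixing matrix `W`, and the toy graph are all assumptions; this is not the paper's exact Algorithm 1, which handles quantization error more carefully.

```python
# Minimal sketch, not the paper's Algorithm 1: push-sum averaging over a directed
# graph where every transmitted vector first passes through an unbiased stochastic
# quantizer. The quantizer, mixing matrix W, and toy graph below are assumptions.
# With this naive full-value quantization the estimates only fluctuate around the
# average; exact consensus requires the paper's error-compensation mechanism.
import numpy as np

rng = np.random.default_rng(0)

def stochastic_quantize(v, levels=16):
    """Unbiased randomized rounding onto a uniform grid (illustrative choice)."""
    scale = np.max(np.abs(v)) + 1e-12
    u = v / scale * (levels - 1)
    low = np.floor(u)
    q = low + (rng.random(u.shape) < (u - low))   # round up with prob. u - low
    return q / (levels - 1) * scale

def quantized_push_sum(x0, W, num_iters=200):
    """x0: (n, d) node values; W: (n, n) column-stochastic mixing matrix."""
    x = x0.copy()
    y = np.ones((x0.shape[0], 1))                 # push-sum weights
    for _ in range(num_iters):
        x = W @ stochastic_quantize(x)            # mix quantized messages
        y = W @ y                                 # scalar weights, mixed exactly
    return x / y                                  # de-biased estimates z_i = x_i / y_i

# Toy directed graph on 5 nodes: node 0 sends to {0, 1, 2}; node j > 0 sends to {j, j+1 mod 5}.
n, d = 5, 3
W = np.zeros((n, n))
W[[0, 1, 2], 0] = 1.0 / 3.0
for j in range(1, n):
    W[[j, (j + 1) % n], j] = 0.5                  # columns sum to 1 (column-stochastic)

x0 = rng.normal(size=(n, d))
print("true average:  ", x0.mean(axis=0))
print("node estimates:\n", quantized_push_sum(x0, W))
```

The push-sum weights y_i are mixed exactly (they are scalars, so this is cheap), and the estimate at node i is z_i = x_i / y_i, which corrects the bias that a column-stochastic (rather than doubly stochastic) mixing matrix would otherwise introduce.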
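
The step-size tuning described in the experiment-setup entry can be sketched as a simple grid search. The candidate grid is assumed to be linearly spaced (the paper only states 20 values in [0.01, 3]), and `run_training` is a hypothetical stand-in for one quantized decentralized SGD run.

```python
# Minimal sketch of the reported step-size tuning: 20 candidate values in [0.01, 3],
# each evaluated after a 50-iteration run, keeping the best-performing one. The grid
# spacing is assumed, and run_training() is a hypothetical placeholder.
import numpy as np

def run_training(step_size, num_iters=50):
    """Placeholder: return the training loss reached after num_iters iterations."""
    return (step_size - 0.7) ** 2                 # dummy proxy so the sketch runs

candidates = np.linspace(0.01, 3.0, 20)           # 20 values in [0.01, 3]
losses = [run_training(s, num_iters=50) for s in candidates]
best = candidates[int(np.argmin(losses))]
print(f"best step size among {len(candidates)} candidates: {best:.3f}")
```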