Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data
Authors: Yasaman Esfandiari, Sin Yong Tan, Zhanhong Jiang, Aditya Balu, Ethan Herron, Chinmay Hegde, Soumik Sarkar
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical comparisons show superior learning performance of CGA over existing state-of-the-art decentralized learning algorithms, as well as maintaining the improved performance under information compression to reduce peer-to-peer communication overhead. ... We present the empirical studies on CIFAR-10 and MNIST datasets ... We compare the effectiveness of our algorithms with other baseline decentralized algorithms such as Swarm SGD (Nadiradze et al., 2019), SGP (Assran et al., 2019), and the momentum variant of DPSGD (Lian et al., 2017) (DPMSGD). |
| Researcher Affiliation | Collaboration | (1) Department of Mechanical Engineering, Iowa State University, Ames, Iowa, USA; (2) Johnson Controls, Milwaukee, Wisconsin, USA; (3) Computer Science and Engineering Department, New York University, New York City, New York, USA. |
| Pseudocode | Yes | Algorithm 1 Cross-Gradient Aggregation (CGA) ... Algorithm 2 Compressed Cross-Gradient Aggregation (CompCGA); a hedged sketch of the CGA update appears after the table. |
| Open Source Code | Yes | The code is available here on GitHub. ... Our code is publicly available on GitHub: https://github.com/yasesf93/CrossGradientAggregation |
| Open Datasets | Yes | We present the empirical studies on CIFAR-10 and MNIST datasets |
| Dataset Splits | No | The paper mentions evaluating on 'local test sets' but does not provide specific training/validation/test dataset splits (e.g., percentages, counts, or explicit references to predefined splits) to reproduce the data partitioning. |
| Hardware Specification | Yes | The experiments are performed on a large high-performance computing cluster with a total of 192 GPUs distributed over 24 nodes. Each node in the cluster is made of 2 Intel Xeon Gold 6248 CPUs with each 20 cores and 8 Tesla V100 32GB SXM2 GPUs. |
| Software Dependencies | No | The paper mentions using models like CNN and VGG11 but does not specify software versions for libraries, frameworks, or languages (e.g., Python, PyTorch, TensorFlow versions) that would be needed for reproducibility. |
| Experiment Setup | Yes | A mini-batch size of 128 is used, the initial step-size is set to 0.01 for CIFAR-10, and the step size is decayed with constant 0.981. The stopping criterion is a fixed number of epochs and the momentum parameter (β) is set to be 0.98. A hedged configuration sketch reflecting these values follows the table. |
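To make the "Pseudocode" entry more concrete, below is a minimal sketch of one cross-gradient aggregation round for a single agent, written in Python/PyTorch. All names (`flat_grad`, `cross_gradient_step`, `neighbor_batches`) are hypothetical and do not come from the authors' repository. The paper's Algorithm 1 combines the collected cross-gradients through a quadratic-programming-based projection before updating; this sketch deliberately replaces that projection with a plain average to keep the example short, so it illustrates only the cross-gradient exchange pattern, not the full method.

```python
import torch
import torch.nn as nn


def flat_grad(model: nn.Module, loss: torch.Tensor) -> torch.Tensor:
    """Return the gradient of `loss` w.r.t. `model`'s parameters as one flat vector."""
    grads = torch.autograd.grad(loss, model.parameters())
    return torch.cat([g.reshape(-1) for g in grads])


def cross_gradient_step(model, local_batch, neighbor_batches, criterion, lr=0.01):
    """One simplified CGA-style update for one agent.

    `neighbor_batches` stands in for the data each neighbor would use when it
    evaluates this agent's model; in the real decentralized setting the agent
    sends its parameters to its neighbors and receives their gradients back.
    """
    x, y = local_batch
    self_grad = flat_grad(model, criterion(model(x), y))

    cross_grads = [self_grad]
    for xn, yn in neighbor_batches:
        # Gradient of *this agent's* model evaluated on a neighbor's data
        # (the "cross-gradient" of the paper).
        cross_grads.append(flat_grad(model, criterion(model(xn), yn)))

    # Placeholder aggregation: a simple average instead of the paper's
    # quadratic-programming-based projection.
    direction = torch.stack(cross_grads).mean(dim=0)

    # Apply the aggregated direction as a plain SGD step.
    offset = 0
    with torch.no_grad():
        for p in model.parameters():
            n = p.numel()
            p -= lr * direction[offset:offset + n].view_as(p)
            offset += n


# Example usage with toy data (shapes are arbitrary):
# model = nn.Linear(10, 2)
# batch = (torch.randn(8, 10), torch.randint(0, 2, (8,)))
# cross_gradient_step(model, batch, [batch, batch], nn.CrossEntropyLoss())
```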
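The "Experiment Setup" row can also be expressed as a configuration sketch. The values below (batch size 128, initial step size 0.01 for CIFAR-10, multiplicative decay constant 0.981, momentum 0.98) are taken from the paper; everything else, including the choice of PyTorch, the dataset transform, and the exact VGG11 variant, is an assumption, since the paper does not list software dependencies. Representing the momentum parameter β with standard SGD momentum is likewise an approximation of the paper's momentum variant.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import vgg11

# Dataset and transform are assumptions; only the batch size is reported in the paper.
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# VGG11 is named in the paper; the torchvision variant used here is an assumption.
model = vgg11(num_classes=10)

# Reported values: initial step size 0.01, momentum parameter 0.98.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.98)

# Decay the step size by a constant factor of 0.981 per epoch, as reported.
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.981)
```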