PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning
Authors: Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Out of the box, these compressors perform on par with state-of-the-art tuned compression algorithms in a series of deep learning benchmarks. |
| Researcher Affiliation | Academia | Thijs Vogels (EPFL), Sai Praneeth Karimireddy (EPFL), Martin Jaggi (EPFL) |
| Pseudocode | Yes | Algorithm 1: Decentralized SGD with edge-wise compression; Algorithm 2: Rank-1 s-step PowerGossip compression for Algorithm 1 (see the sketch below this table) |
| Open Source Code | Yes | This paper's code is available at https://github.com/epfml/powergossip. |
| Open Datasets | Yes | We study the algorithm on the Cifar-10 image classification benchmark... We also follow the language modeling experiment on WikiText-2... 64×64 images from the Faces Database (AT&T Laboratories Cambridge). (Referenced with URL https://scikit-learn.org/0.19/datasets/olivetti_faces.html) |
| Dataset Splits | No | The paper mentions using standard datasets like Cifar-10 and WikiText-2 and states 'labeled images that are reshuffled between 8 workers every epoch,' but it does not explicitly provide percentages or sample counts for training, validation, or test splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In all experiments, we tune the hyperparameters of our baselines according to Appendix G and use the same learning rate as uncompressed centralized SGD for all instances of PowerGossip. Our compression level is varied through the number of power iterations per gradient update. |
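For readers who want a concrete picture of the compression step named in the Pseudocode row, below is a minimal NumPy sketch of rank-1, s-step power-iteration compression of the parameter difference between two neighboring workers, in the spirit of Algorithm 2. This is a reconstruction from the algorithm names quoted above, not the authors' code: the function names, the step size, and the normalization epsilon are illustrative assumptions, and the reference implementation lives at https://github.com/epfml/powergossip.

```python
import numpy as np

def power_compress(diff, v, num_iterations=1):
    """Rank-1 power iteration: approximate `diff` by the outer product
    u v^T, warm-started from the previous round's right vector `v`.
    `num_iterations` (>= 1) plays the role of the paper's s steps."""
    for _ in range(num_iterations):
        u = diff @ v
        u /= np.linalg.norm(u) + 1e-12  # normalized left vector (epsilon is an assumption)
        v = diff.T @ u                  # unnormalized right vector carries the scale
    return u, v

def edge_gossip_step(x_i, x_j, v, step_size=0.5, num_iterations=1):
    """One compressed exchange on edge (i, j): both workers move toward
    each other along the rank-1 estimate of their parameter difference."""
    u, v = power_compress(x_i - x_j, v, num_iterations)
    approx = np.outer(u, v)  # low-rank estimate of x_i - x_j
    return x_i - step_size * approx, x_j + step_size * approx, v
```

Reusing the returned `v` as the warm start for the next round is what lets very few power iterations per update suffice, and varying `num_iterations` corresponds to the compression-level knob quoted in the Experiment Setup row.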