Detached Error Feedback for Distributed SGD with Random Sparsification
Authors: An Xu, Heng Huang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive deep learning experiments show significant empirical improvement of the proposed methods under various settings. |
| Researcher Affiliation | Academia | Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA 15213, USA. |
| Pseudocode | Yes | Algorithm 1 Detached Error Feedback (DEF(-A)). |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about releasing the source code for the described methodology. |
| Open Datasets | Yes | Extensive deep image classification experiments on CIFAR-10/100 and ImageNet show significant improvements of DEF(-A) over existing works with RBGS. |
| Dataset Splits | Yes | The model is trained for 200 epochs with a learning rate decay of 0.1 at epochs 100 and 150. Random cropping, random flipping, and standardization are applied as data augmentation techniques. |
| Hardware Specification | Yes | Each machine is equipped with 4 NVIDIA P40 GPUs and there are 16 workers (GPUs) in total. |
| Software Dependencies | No | The paper mentions 'PyTorch' and 'NCCL as the backend of the PyTorch distributed package' but does not specify version numbers for any software component. |
| Experiment Setup | Yes | The base learning rate is tuned from {…, 0.1, 0.05, 0.01, …} and the batch size is 128. The momentum constant is 0.9 and the weight decay is 5×10⁻⁴. The model is trained for 200 epochs with a learning rate decay of 0.1 at epochs 100 and 150. Random cropping, random flipping, and standardization are applied as data augmentation techniques. |
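
The Pseudocode row points to Algorithm 1, DEF(-A). As a reading aid, below is a minimal Python sketch of the plain error-feedback loop with random sparsification that DEF builds on; the names (`random_k_sparsify`, `ErrorFeedbackWorker`) are hypothetical, and DEF's distinguishing step of evaluating the gradient at a model detached from the accumulated error is only noted in a comment, not implemented.

```python
import torch

def random_k_sparsify(tensor: torch.Tensor, keep_ratio: float = 0.01) -> torch.Tensor:
    """Zero all but a random fraction of coordinates (random sparsification)."""
    flat = tensor.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    mask = torch.zeros_like(flat)
    mask[torch.randperm(flat.numel())[:k]] = 1.0
    return (flat * mask).view_as(tensor)

class ErrorFeedbackWorker:
    """Classic error feedback: compress (gradient + residual), store what was dropped.

    DEF departs from this scheme by computing the gradient at a model point
    detached from the accumulated error; that variant is not reproduced here.
    """
    def __init__(self, shape, keep_ratio: float = 0.01):
        self.error = torch.zeros(shape)   # residual buffer e_t
        self.keep_ratio = keep_ratio

    def step(self, grad: torch.Tensor) -> torch.Tensor:
        corrected = grad + self.error                            # p_t = g_t + e_t
        sparse = random_k_sparsify(corrected, self.keep_ratio)   # C(p_t)
        self.error = corrected - sparse                          # e_{t+1} = p_t - C(p_t)
        return sparse                                            # message sent to the server
```

A worker would call `step` on each local gradient and communicate only the sparse result, e.g. `worker = ErrorFeedbackWorker((10,), 0.3); msg = worker.step(torch.randn(10))`.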
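
Similarly, the quoted experiment setup maps onto a standard PyTorch training configuration. The sketch below is an assumption-laden reconstruction: the architecture, the normalization statistics, and the chosen base learning rate are illustrative placeholders, not values confirmed by the paper.

```python
import torch
import torchvision
import torchvision.transforms as T

# Augmentation named in the setup: random cropping, random flipping, standardization.
train_transform = T.Compose([
    T.RandomCrop(32, padding=4),            # assumes 32x32 CIFAR-10/100 inputs
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465),   # commonly used CIFAR-10 statistics
                (0.2470, 0.2435, 0.2616)),
])

model = torchvision.models.resnet18(num_classes=10)  # placeholder architecture
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.1,                  # base lr, tuned per the quoted set
                            momentum=0.9,
                            weight_decay=5e-4)
# 200 epochs with the learning rate decayed by 0.1 at epochs 100 and 150.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150], gamma=0.1)
```

With batch size 128, `scheduler.step()` would be called once per epoch over the 200 training epochs.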