DoubleSqueeze: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression
Authors: Hanlin Tang, Chen Yu, Xiangru Lian, Tong Zhang, Ji Liu
ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An empirical study is also conducted to validate our theoretical results. ... (Section 5, Experiments) We validate our theory with experiments that compare DOUBLESQUEEZE with other compression implementations. |
| Researcher Affiliation | Collaboration | 1University of Rochester, 2Hong Kong University of Science and Technology, 3Seattle AI Lab, FeDA Lab, Kwai Inc. |
| Pseudocode | Yes | Algorithm 1 DOUBLESQUEEZE |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described, nor does it include a specific repository link or explicit code release statement. |
| Open Datasets | Yes | Datasets and models: We evaluate DOUBLESQUEEZE by training ResNet-18 (He et al., 2016) on CIFAR-10. |
| Dataset Splits | No | The paper mentions training on CIFAR-10 and evaluating testing accuracy, but does not provide specific train/validation/test dataset splits, sample counts, or explicit cross-validation details for reproduction. |
| Hardware Specification | Yes | Each worker computes gradients on a Nvidia 1080Ti. |
| Software Dependencies | No | The paper mentions "TensorFlow" in its references, but does not specify the version of TensorFlow or any other key software components with their version numbers that are necessary to replicate the experiment. |
| Experiment Setup | Yes | The learning rate starts with 0.1 and is reduced by a factor of 10 every 160 epochs. The batch size is set to 256 on each worker. |
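
The Pseudocode row refers to Algorithm 1 (DOUBLESQUEEZE), which compresses gradients twice per iteration: once on each worker and once on the parameter server, with both sides folding their compression error into the next step's gradient. The sketch below illustrates that double-pass error compensation; the `topk_compress` helper, the function names, and the dense NumPy framing are illustrative assumptions rather than the paper's exact implementation, which allows any compressor with bounded compression error.

```python
import numpy as np

def topk_compress(v, k):
    """Hypothetical compressor: keep the k largest-magnitude entries, zero the rest.
    Any compressor with bounded compression error could be plugged in instead."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def doublesqueeze_step(x, stochastic_grads, worker_errors, server_error, lr, k):
    """One iteration in the spirit of Algorithm 1 (DOUBLESQUEEZE), simplified.

    x                -- current model parameters (shared by all workers)
    stochastic_grads -- list of per-worker stochastic gradients at x
    worker_errors    -- per-worker compression errors from the previous step
    server_error     -- server-side compression error from the previous step
    """
    n = len(stochastic_grads)
    compressed_from_workers = []
    for i in range(n):
        # Worker i: compensate with its local error, compress, record the new error.
        v_i = stochastic_grads[i] + worker_errors[i]
        c_i = topk_compress(v_i, k)
        worker_errors[i] = v_i - c_i
        compressed_from_workers.append(c_i)

    # Server: average the compressed gradients, compensate with its own error,
    # compress again (the second "squeeze"), and record the new server error.
    v = sum(compressed_from_workers) / n + server_error
    c = topk_compress(v, k)
    server_error = v - c

    # Every worker applies the same doubly compressed update.
    x_new = x - lr * c
    return x_new, worker_errors, server_error
```

In this scheme both error buffers start at zero, and only the compressed vectors travel over the network, which is where the communication saving comes from.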
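The Experiment Setup row pins down the learning-rate schedule and per-worker batch size. A minimal sketch of that step schedule, assuming a standard step decay and nothing else about the training loop:

```python
def learning_rate(epoch, base_lr=0.1, drop_every=160, factor=0.1):
    """Step schedule from the Experiment Setup row: start at 0.1 and
    divide the learning rate by 10 every 160 epochs."""
    return base_lr * factor ** (epoch // drop_every)

PER_WORKER_BATCH_SIZE = 256  # batch size on each worker, as reported

# Example: epochs 0-159 train at 0.1, epochs 160-319 at 0.01, and so on.
```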