Decoupled Parallel Backpropagation with Convergence Guarantee
Authors: Zhouyuan Huo, Bin Gu, Qian Yang, Heng Huang
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we perform experiments for training deep convolutional neural networks on benchmark datasets. The experimental results not only confirm our theoretical analysis, but also demonstrate that the proposed method can achieve significant speedup without loss of accuracy. |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, United States. |
| Pseudocode | Yes | Algorithm 1 SGD-DDG |
| Open Source Code | No | The paper refers to using PyTorch and the multiprocessing package (with a link to its documentation), but there is no explicit statement or link indicating that the authors' implementation code for the DDG algorithm is open-source or publicly available. |
| Open Datasets | Yes | In this section, we experiment with ResNet (He et al., 2016) on image classification benchmark datasets: CIFAR-10 and CIFAR-100 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper mentions using CIFAR-10 and CIFAR-100 and discusses training and testing, but it does not explicitly provide details about specific training/validation/test dataset splits (e.g., percentages, sample counts for a validation set, or a specific strategy for creating one). |
| Hardware Specification | Yes | In this section, we train ResNet-8 on CIFAR-10 on a single Titan X GPU. |
| Software Dependencies | No | The paper mentions using the 'PyTorch library (Paszke et al., 2017)' and the 'multiprocessing package', but it does not specify exact version numbers for PyTorch or any other software dependencies crucial for replication (e.g., Python version, CUDA version). |
| Experiment Setup | Yes | All experiments are run for 300 epochs and optimized using Adam optimizer (Kingma & Ba, 2014) with a batch size of 128. The stepsize is initialized at 1 × 10−3. |
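The pseudocode row above refers to the paper's Algorithm 1 (SGD-DDG), whose core idea is that earlier network modules update with gradients that are a few iterations stale, so backward passes for different modules can run in parallel. A minimal toy sketch of that delayed-gradient update, not the authors' implementation: the function name, the quadratic objective, and the fixed delay are illustrative assumptions, and the real algorithm splits a deep network into modules rather than optimizing a scalar loss.

```python
# Toy sketch of SGD with delayed gradients, the mechanism behind SGD-DDG.
# We minimize f(w) = 0.5 * sum(w_i^2), whose gradient at w is simply w,
# but apply each gradient only after it has sat in a FIFO buffer for
# `delay` iterations, mimicking a module that receives stale gradients.
def delayed_gradient_sgd(w0, lr=0.1, delay=2, steps=200):
    w = list(w0)
    buf = []  # FIFO buffer of gradients computed at past iterates
    for _ in range(steps):
        buf.append(list(w))  # gradient of 0.5 * ||w||^2 at the current w
        if len(buf) > delay:
            g = buf.pop(0)  # gradient that is `delay` iterations old
            w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

w_final = delayed_gradient_sgd([1.0, -2.0])
print(sum(x * x for x in w_final) ** 0.5)  # near 0: converges despite delay
```

For a small enough step size the iterates still converge even though every update uses a stale gradient, which is the intuition the paper's convergence analysis makes precise.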