Global Convergence of Block Coordinate Descent in Deep Learning
Authors: Jinshan Zeng, Tim Tsz-Kit Lau, Shaobo Lin, Yuan Yao
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As shown in Figure 1, it is observed that vanilla SGD fails to train a ten-hidden-layer MLP while BCD still works and achieves a moderate accuracy within a few epochs. Refer to Appendix F for details of this experiment. |
| Researcher Affiliation | Academia | 1School of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, Jiangxi, China; 2Department of Mathematics, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong; 3Department of Statistics, Northwestern University, Evanston, IL 60208, USA; 4Department of Mathematics, City University of Hong Kong, Kowloon, Hong Kong. |
| Pseudocode | Yes | Algorithm 1 Two-splitting BCD for DNN Training (2.3) and Algorithm 2 Three-splitting BCD for DNN training (2.5) |
| Open Source Code | Yes | Codes available at: https://github.com/timlautk/BCD-for-DNNs-PyTorch |
| Open Datasets | Yes | Figure 1. Comparison of training and test accuracies of BCD and SGD for training ten-hidden-layer MLPs on the MNIST dataset. |
| Dataset Splits | No | Figure 1. Comparison of training and test accuracies of BCD and SGD for training ten-hidden-layer MLPs on the MNIST dataset. Refer to Appendix F for details of this experiment. The paper mentions the MNIST dataset but does not explicitly specify the training, validation, and test splits (e.g., percentages, sample counts, or a citation to a standard split) in the main text. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models. |
| Software Dependencies | No | Codes available at: https://github.com/timlautk/BCD-for-DNNs-PyTorch. This indicates PyTorch is used, but no specific version numbers are provided for PyTorch or other software dependencies. |
| Experiment Setup | No | Refer to Appendix F for details of this experiment. The main text does not contain specific hyperparameter values, training configurations, or system-level settings. |
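To give a rough feel for the splitting idea behind Algorithm 1, the sketch below is a minimal, hypothetical two-splitting BCD loop for a one-hidden-layer network. It is not the paper's algorithm: the paper handles ReLU activations via proximal updates, whereas here an identity activation is assumed so that each block update is an exact least-squares minimization and the penalized objective decreases monotonically. All variable names (`W1`, `W2`, `V1`, `gamma`) are illustrative.

```python
# Hypothetical two-splitting BCD sketch (identity activation, squared loss).
# Objective: ||V1 @ W2 - Y||^2 + gamma * ||V1 - X @ W1||^2,
# where V1 is an auxiliary block decoupling the two layers.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 5))       # 20 samples, 5 input features
Y = rng.standard_normal((20, 1))       # regression targets

W1 = 0.1 * rng.standard_normal((5, 8))  # layer-1 weights
W2 = 0.1 * rng.standard_normal((8, 1))  # layer-2 weights
V1 = X @ W1                             # auxiliary activation block
gamma = 1.0                             # coupling penalty weight

def objective(W1, W2, V1):
    return (np.sum((V1 @ W2 - Y) ** 2)
            + gamma * np.sum((V1 - X @ W1) ** 2))

obj_start = objective(W1, W2, V1)
for _ in range(30):
    # Block 1: W2 <- argmin ||V1 @ W2 - Y||^2 (exact least squares).
    W2 = np.linalg.lstsq(V1, Y, rcond=None)[0]
    # Block 2: V1 <- exact minimizer of the quadratic in V1:
    # V1 (W2 W2^T + gamma I) = Y W2^T + gamma X W1.
    V1 = (Y @ W2.T + gamma * (X @ W1)) @ np.linalg.inv(
        W2 @ W2.T + gamma * np.eye(8))
    # Block 3: W1 <- argmin ||X @ W1 - V1||^2 (exact least squares).
    W1 = np.linalg.lstsq(X, V1, rcond=None)[0]
obj_end = objective(W1, W2, V1)
```

Because every block is minimized exactly with the other blocks held fixed, `obj_end <= obj_start` holds by construction; the paper's contribution is proving global convergence of such schemes for deep networks with nonsmooth activations.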