On Markov Chain Gradient Descent
Authors: Tao Sun, Yuejiao Sun, Wotao Yin
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present two kinds of numerical results. The first one is to show that MCGD uses fewer samples to train both a convex model and a nonconvex model. The second one demonstrates the advantage of the faster mixing of a non-reversible Markov chain. Our results on nonconvex objective and non-reversible chains are new. |
| Researcher Affiliation | Academia | Tao Sun, College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China (nudtsuntao@163.com); Yuejiao Sun, Department of Mathematics, University of California, Los Angeles, CA 90095, USA (sunyj@math.ucla.edu); Wotao Yin, Department of Mathematics, University of California, Los Angeles, CA 90095, USA (wotaoyin@math.ucla.edu) |
| Pseudocode | No | The paper describes its algorithms mathematically through numbered equations (e.g., (2), (3), (5)) but does not present them in a structured pseudocode or algorithm block. (A hedged sketch of the update appears after this table.) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for their methodology is openly available. |
| Open Datasets | No | The paper generates synthetic data for its experiments (e.g., 'Randomly sample a vector u ∈ ℝ^d, d = 50' and 'construct an undirected connected graph with n = 20 nodes with edges randomly generated') rather than using a publicly available dataset with concrete access information or a citation. (A sketch of this synthetic setup follows the table.) |
| Dataset Splits | No | The paper does not explicitly provide details about validation dataset splits. It discusses 'training' models but not a separate 'validation' set. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch x.x). |
| Experiment Setup | Yes | We choose γ_k = 1/k^q as our stepsize, where q = 0.501. This choice is consistent with our theory below. |
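As noted in the Pseudocode row, the paper states its method only in equation form. For orientation, here is a minimal sketch of the Markov chain gradient descent (MCGD) update x^{k+1} = x^k − γ_k ∇f_{j_k}(x^k), where the sample index j_k is produced by walking a Markov chain over the data rather than by i.i.d. sampling, using the stepsize γ_k = 1/k^q with q = 0.501 quoted in the Experiment Setup row. This is not the authors' code; the objective, transition matrix, and function names are placeholders.

```python
import numpy as np

def mcgd(grad_fns, P, x0, n_iters=10_000, q=0.501, rng=None):
    """Hedged sketch of Markov chain gradient descent (MCGD).

    grad_fns : list of per-sample gradient functions, grad_fns[j](x) = grad f_j(x)
    P        : row-stochastic transition matrix of the Markov chain over samples
    x0       : initial iterate
    q        : stepsize exponent; gamma_k = 1 / k**q (q = 0.501 in the paper)
    """
    if rng is None:
        rng = np.random.default_rng(0)
    x = x0.copy()
    j = 0  # current state of the Markov chain (which sample we hold)
    for k in range(1, n_iters + 1):
        gamma = 1.0 / k**q                     # diminishing stepsize gamma_k = 1/k^q
        x = x - gamma * grad_fns[j](x)         # gradient step on the current sample
        j = rng.choice(len(grad_fns), p=P[j])  # advance the chain one step
    return x
```

Pairing this with per-node least-squares gradients and a random-walk transition matrix such as the one sketched next would mimic the paper's convex experiment.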
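Similarly, the synthetic setup quoted in the Open Datasets row (a random vector u ∈ ℝ^d with d = 50 and a random connected graph on n = 20 nodes) could be reproduced along these lines. The paper does not specify the random-graph model; the Erdős–Rényi construction with resampling until connected, and the uniform random walk below, are assumptions.

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)

# Ground-truth parameter vector: u in R^d with d = 50, as quoted above.
d = 50
u = rng.standard_normal(d)

# Undirected connected graph with n = 20 nodes and randomly generated edges.
# Assumption: Erdos-Renyi G(n, p), resampled until the graph is connected.
n = 20
while True:
    G = nx.gnp_random_graph(n, p=0.3, seed=int(rng.integers(1 << 31)))
    if nx.is_connected(G):
        break

# Random walk on the graph: from node i, jump to a uniformly random neighbor.
A = nx.to_numpy_array(G)              # adjacency matrix
P = A / A.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
```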