Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization
Authors: Tian Ye, Simon S. Du
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper presents the first proof that shows randomly initialized gradient descent converges to a global minimum of the asymmetric low-rank factorization problem with a polynomial rate. For the proof, we develop 1) a new symmetrization technique to capture the magnitudes of the symmetry and asymmetry, and 2) a quantitative perturbation analysis to approximate matrix derivatives. We believe both are useful for other related non-convex problems. ... The goal of this paper is not to provide new provably efficient algorithms to solve Problem (1), but to provide a rigorous analysis of an intriguing and practically relevant phenomenon on gradient descent. |
| Researcher Affiliation | Academia | Tian Ye Institute for Interdisciplinary Information Sciences Tsinghua University yet17@mails.tsinghua.edu.cn Simon S. Du Paul G. Allen School of Computer Science and Engineering University of Washington ssdu@cs.washington.edu |
| Pseudocode | No | The paper presents equations for gradient descent updates (e.g., equations 2 and 3) but does not provide them in a formally structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any statements about releasing source code or links to a code repository for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not report on experiments using datasets for training or evaluation. It references empirical observations from prior work but does not use its own data. |
| Dataset Splits | No | The paper is theoretical and performs no empirical validation on datasets, so no train/validation/test splits are reported. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup, including hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software or libraries with version numbers used for experiments. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details, such as hyperparameters or training configurations. |
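Although the paper gives its gradient descent updates only as equations (Eqs. 2 and 3) rather than pseudocode, the procedure it analyzes is straightforward to sketch. The snippet below is a minimal illustration, not the authors' code: it assumes the standard asymmetric factorization objective f(U, V) = ½‖UVᵀ − M‖²_F with small random initialization and simultaneous updates of both factors, which is the setting the paper's convergence result addresses. The step size, scale of initialization, and iteration count here are arbitrary illustrative choices.

```python
import numpy as np

def gradient_descent_factorization(M, r, eta=0.01, steps=20000, seed=0):
    """Randomly initialized gradient descent on f(U, V) = 0.5 * ||U V^T - M||_F^2.

    Illustrative sketch of the update rule the paper analyzes; eta, steps,
    and the 0.1 initialization scale are arbitrary choices, not the paper's.
    """
    rng = np.random.default_rng(seed)
    m, n = M.shape
    # Small random initialization of both factors.
    U = 0.1 * rng.standard_normal((m, r))
    V = 0.1 * rng.standard_normal((n, r))
    for _ in range(steps):
        R = U @ V.T - M  # residual of the current factorization
        # Simultaneous gradient steps on U and V (in the style of Eqs. 2-3).
        U, V = U - eta * R @ V, V - eta * R.T @ U
    return U, V

# Usage: recover an exactly rank-2 target matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))
U, V = gradient_descent_factorization(M, r=2)
err = np.linalg.norm(U @ V.T - M)
```

On a well-conditioned rank-r target like this one, the residual norm `err` shrinks toward zero despite the nonconvexity of the objective, which is the phenomenon the paper proves occurs at a polynomial rate from random initialization.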