Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes
Authors: Chris Junchi Li, Zhaoran Wang, Han Liu
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we propose a new analytic paradigm based on diffusion processes to characterize the global dynamics of nonconvex statistical optimization. As a concrete example, we study stochastic gradient descent (SGD) for the tensor decomposition formulation of independent component analysis. In particular, we cast different phases of SGD into diffusion processes, i.e., solutions to stochastic differential equations. Initialized from an unstable equilibrium, the global dynamics of SGD transit over three consecutive phases: (i) an unstable Ornstein-Uhlenbeck process slowly departing from the initialization, (ii) the solution to an ordinary differential equation, which quickly evolves towards the desirable local minimum, and (iii) a stable Ornstein-Uhlenbeck process oscillating around the desirable local minimum. Our proof techniques are based upon Stroock and Varadhan's weak convergence of Markov chains to diffusion processes, which are of independent interest. |
| Researcher Affiliation | Academia | Chris Junchi Li, Zhaoran Wang, Han Liu; Department of Operations Research and Financial Engineering, Princeton University |
| Pseudocode | No | The paper describes the SGD algorithm via Eq. (2.3) but does not present it in a pseudocode block or algorithm format. |
| Open Source Code | No | The paper does not provide any statement about releasing source code or links to a code repository. |
| Open Datasets | No | The paper is theoretical and does not use or provide access information for a specific dataset for training. |
| Dataset Splits | No | The paper does not describe any experimental setup or data splits for training, validation, or testing. |
| Hardware Specification | No | The paper focuses on theoretical analysis and does not mention any hardware specifications used for experiments. |
| Software Dependencies | No | The paper focuses on theoretical analysis and does not list any specific software dependencies or their versions. |
| Experiment Setup | No | The paper focuses on theoretical analysis and does not provide any specific experimental setup details or hyperparameters. |
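
Since the paper ships neither pseudocode nor code, the following is only a minimal illustrative sketch of the kind of online SGD the abstract describes, not the authors' implementation. It assumes a projected update on the unit sphere of the form u ← Π(u + η · s · (uᵀx)³ · x), where s is the sign of the excess kurtosis of the sources; Eq. (2.3) is not reproduced in this summary, so this form is an assumption. The names `project_to_sphere`, `online_ica_sgd`, `step_size`, and `kurtosis_sign` are hypothetical, as is the choice of Rademacher (±1) sources for the demo.

```python
import numpy as np


def project_to_sphere(u):
    """Rescale u to unit Euclidean norm (the projection step)."""
    return u / np.linalg.norm(u)


def online_ica_sgd(samples, u0, step_size, kurtosis_sign=1.0):
    """Run one pass of projected SGD over a stream of samples.

    Assumed update (hypothetical form, see lead-in):
        u <- project(u + step_size * kurtosis_sign * (u^T x)^3 * x)
    Returns the full trajectory, one row per iterate (including u0).
    """
    u = project_to_sphere(np.asarray(u0, dtype=float))
    trajectory = [u.copy()]
    for x in samples:
        # Stochastic gradient of (u^T x)^4 / 4 with respect to u.
        grad = (u @ x) ** 3 * x
        u = project_to_sphere(u + step_size * kurtosis_sign * grad)
        trajectory.append(u.copy())
    return np.array(trajectory)


# Demo under stated assumptions: independent Rademacher sources have
# kurtosis 1 < 3, so kurtosis_sign = -1. The all-ones direction is a
# stationary point of the population objective and serves here as a
# stand-in for the "unstable equilibrium" initialization.
rng = np.random.default_rng(0)
dim = 3
samples = rng.choice([-1.0, 1.0], size=(5000, dim))
trajectory = online_ica_sgd(samples, np.ones(dim), step_size=0.01,
                            kurtosis_sign=-1.0)
largest_coordinate = np.max(np.abs(trajectory), axis=1)
```

Plotting `largest_coordinate` against the iteration index is one way to look for the three qualitative phases named in the abstract: a slow departure from the initialization, a fast move toward a coordinate direction, and small oscillations around it. How clearly the phases separate depends on the step size and dimension chosen here, which are illustrative values rather than settings from the paper.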