Online ICA: Understanding Global Dynamics of Nonconvex Optimization via Diffusion Processes

Authors: Chris Junchi Li, Zhaoran Wang, Han Liu

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we propose a new analytic paradigm based on diffusion processes to characterize the global dynamics of nonconvex statistical optimization. As a concrete example, we study stochastic gradient descent (SGD) for the tensor decomposition formulation of independent component analysis. In particular, we cast different phases of SGD into diffusion processes, i.e., solutions to stochastic differential equations. Initialized from an unstable equilibrium, the global dynamics of SGD transit over three consecutive phases: (i) an unstable Ornstein-Uhlenbeck process slowly departing from the initialization, (ii) the solution to an ordinary differential equation, which quickly evolves towards the desirable local minimum, and (iii) a stable Ornstein-Uhlenbeck process oscillating around the desirable local minimum. Our proof techniques are based upon Stroock and Varadhan's weak convergence of Markov chains to diffusion processes, which are of independent interest.
Researcher Affiliation | Academia | Chris Junchi Li, Zhaoran Wang, Han Liu; Department of Operations Research and Financial Engineering, Princeton University
Pseudocode | No | The paper describes the SGD algorithm via Eq. (2.3) but does not present it in a pseudocode block or algorithm format.
Open Source Code | No | The paper does not provide any statement about releasing source code or links to a code repository.
Open Datasets | No | The paper is theoretical and does not use or provide access information for a specific dataset for training.
Dataset Splits | No | The paper does not describe any experimental setup or data splits for training, validation, or testing.
Hardware Specification | No | The paper focuses on theoretical analysis and does not mention any hardware specifications used for experiments.
Software Dependencies | No | The paper focuses on theoretical analysis and does not list any specific software dependencies or their versions.
Experiment Setup | No | The paper focuses on theoretical analysis and does not provide any specific experimental setup details or hyperparameters.
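
Since the paper itself provides no pseudocode or source code, the three phases described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation and not a transcription of Eq. (2.3). It assumes already-whitened data with an identity mixing matrix, independent components drawn uniformly on [-sqrt(3), sqrt(3)] (unit variance, fourth moment 1.8, so sub-Gaussian), and projected SGD that decreases the empirical fourth moment of u^T y on the unit sphere. The function name online_ica_sgd and all parameter values (d, eta, n_steps, seed) are hypothetical choices made for illustration.

```python
import numpy as np


def online_ica_sgd(d=4, eta=0.005, n_steps=20000, seed=0):
    """Projected SGD on the unit sphere for a fourth-moment ICA objective.

    Assumptions (illustrative, not taken from the paper):
      * the data are whitened and the mixing matrix is the identity,
      * components are i.i.d. uniform on [-sqrt(3), sqrt(3)], which gives
        unit variance and fourth moment 1.8 (below the Gaussian value 3),
      * the objective is E[(u^T y)^4] / 4, minimized over the unit sphere,
        whose minimizers under these assumptions are the signed basis vectors.
    """
    rng = np.random.default_rng(seed)

    # Start at the symmetric point (1/sqrt(d), ..., 1/sqrt(d)), which is an
    # unstable equilibrium of the averaged dynamics under these assumptions.
    u = np.full(d, 1.0 / np.sqrt(d))

    half_width = np.sqrt(3.0)
    alignment = np.empty(n_steps)

    for t in range(n_steps):
        # One fresh sample per iteration (online setting).
        y = rng.uniform(-half_width, half_width, size=d)

        # Stochastic gradient of (u^T y)^4 / 4 with respect to u is (u^T y)^3 y.
        u = u - eta * (u @ y) ** 3 * y

        # Project back onto the unit sphere.
        u /= np.linalg.norm(u)

        # Largest absolute coordinate: 1/sqrt(d) at the start, near 1 when
        # u has aligned with a single independent component.
        alignment[t] = np.max(np.abs(u))

    return u, alignment


u_final, alignment = online_ica_sgd()
print("alignment at start:", alignment[0])
print("mean alignment over last 1000 steps:", alignment[-1000:].mean())
```

Plotting alignment against the iteration index is one way to look for the qualitative picture from the abstract: a slow departure from the symmetric starting point, a fast transition, and then small fluctuations around one of the coordinate directions. The step size controls both how long the first phase lasts and how large the final fluctuations are.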