Convergence Analysis of Gradient Descent for Eigenvector Computation

Authors: Zhiqiang Xu, Xin Cao, Xin Gao

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Theoretical properties are verified in experiments.
Researcher Affiliation Academia 1 KAUST, Saudi Arabia 2 UNSW, Australia zhiqiang.xu@kaust.edu.sa, xin.cao@unsw.edu.au, xin.gao@kaust.edu.sa
Pseudocode No The paper contains mathematical equations and derivations but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code No The paper does not provide any specific links to open-source code or explicitly state that the code for the described methodology is publicly available.
Open Datasets No For synthetic data, the paper describes how it is generated (e.g., "follow [Shamir, 2015] to generate data") but does not provide a direct link to a publicly available dataset or a citation with author names and year for the specific dataset used. For real data, it mentions "Schenk1" and provides a URL "www.cise.ufl.edu/research/sparse/matrices/" but this is a general collection and not a direct link/citation to a specific dataset or file used within the context of this paper.
Dataset Splits No The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages or sample counts).
Hardware Specification No The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies No The paper mentions 'matlab s eigs function' but does not specify any software dependencies with version numbers.
Experiment Setup Yes If the initial x0 = y y 2 where entries of y are i.i.d. standard normal samples, i.e., yi N(0, 1)... The solver with constant step-sizes α < p 2λ2 1(1+α p) will converge... the solver with diminishing step-sizes αt = c τ+t for sufficiently large constants c, τ > 0 will converge... We set n = 1000 and follow [Shamir, 2015] to generate data using the full eigenvalue decomposition A = UΣU . Σ = diag(Σ1, Σ2), where Σ2 = diag( |g1| / n , ..., |gn r| / n ) with gi N(0, 1) and r being the order of Σ1. In addition, the following two settings are considered: p = 1: Σ1 = diag(1, 1 η, 1 1.1η, 1 1.2η, 1 1.3η, 1 1.4η), and then p / λ1 = η > 0, where η {0.2, 0.5, 0.8}. p = 3: Σ1 = diag(1, 1, 1), and then p / λ1 = 1 |g1| / n . (Figure 1 labels: c=12.87, τ=10; c=14.48, τ=10; c=16.09, τ=10).