Gen-Oja: Simple & Efficient Algorithm for Streaming Generalized Eigenvector Computation

Authors: Kush Bhatia, Aldo Pacchiano, Nicolas Flammarion, Peter L. Bartlett, Michael I. Jordan

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Here we illustrate the practical utility of Gen-Oja on a synthetic, streaming generalized eigenvector problem. We take d = 20 and T = 10^6. The streams (At, Bt) (Rd d)^2 are normally-distributed with covariance matrix A and B with random eigenvectors and eigenvalues decaying as 1/i, for i = 1, . . . , d. Here R^2 denotes the radius of the streams with R^2 = max{Tr A, Tr B}. All results are averaged over ten repetitions. Comparison with two-steps methods. In the left plot of Figure 1 we compare the behavior of Gen-Oja to different two-steps algorithms.
Researcher Affiliation Academia Kush Bhatia University of California, Berkeley kushbhatia@berkeley.edu Aldo Pacchiano University of California, Berkeley pacchiano@berkeley.edu Nicolas Flammarion University of California, Berkeley flammarion@berkeley.edu Peter L. Bartlett University of California, Berkeley peter@berkeley.edu Michael I. Jordan University of California, Berkeley jordan@cs.berkeley.edu
Pseudocode Yes Algorithm 1: Gen-Oja for Streaming Av = λBv
Open Source Code No The paper does not provide any explicit statements about open-source code availability or links to code repositories.
Open Datasets No The paper uses a synthetic dataset generated for the simulations: "The streams (At, Bt) (Rd d)^2 are normally-distributed with covariance matrix A and B with random eigenvectors and eigenvalues decaying as 1/i, for i = 1, . . . , d." It does not provide access information for a publicly available or open dataset as it's generated for the experiment.
Dataset Splits No The paper describes a streaming problem with synthetic data generation; it does not mention traditional training, validation, or test splits. The evaluation is conducted on the generated streams.
Hardware Specification No The paper does not provide any specific details about the hardware used to run the experiments.
Software Dependencies No The paper does not list specific software dependencies with version numbers.
Experiment Setup Yes Fix any δ > 0 and ϵ1 > 0. Suppose that the step sizes are set to αt = c log(d2β+t) and βt = γ λ(d2β+t) for γ > 1/2 , c > 1 and ... In the middle plot of Figure 1 we compare the behavior of Gen-Oja for step size α {α , α /8, α /16} where α = 1/R2. We observe that Gen-Oja converges at a rate O(1/t) independently of the choice of α. Robustness to incorrect step-size βt. In the right plot of Figure 1 we compare the behavior of Gen-Oja for step size βt {β /t, β /16t, β /i} where β corresponds to the minimal error after one pass over the data.