Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates

Authors: Yuanzhi Li, Yingyu Liang, Andrej Risteski

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper proposes an algorithm that alternates between decoding the weights and updating the features, and shows that, assuming a generative model of the data, it provably recovers the ground truth under fairly mild conditions. The analysis relies on a carefully designed coupling between two potential functions, which we believe is of independent interest.
Researcher Affiliation | Academia | Yuanzhi Li, Yingyu Liang, Andrej Risteski; Computer Science Department, Princeton University, 35 Olden St, Princeton, NJ 08540; {yuanzhil, yingyul, risteski}@cs.princeton.edu
Pseudocode | Yes |
Algorithm 1 Purification
Input: initialization A(0), threshold α, step size η, scaling factor r, sample size N, iterations T
1: for t = 0, 1, 2, ..., T − 1 do
2:   Draw examples y1, ..., yN.
3:   (Decode) Compute A†, the pseudo-inverse of A(t) with minimum ‖A†‖∞. Set x = φα(A†y) for each example y. // φα is the ReLU activation; see (2) for the definition
4:   (Update) Update the feature matrix A(t+1) = (1 − η)A(t) + rη Ê[(y − y′)(x − x′)⊤], where Ê is over independent uniform y, y′ from {y1, ..., yN}, and x, x′ are their decodings.
Output: A = A(T)
(A hedged Python sketch of this loop is given after the table.)
Open Source Code | No | The paper refers to third-party software (lda-c) with a GitHub link, but does not state that code for its own method is publicly available.
Open Datasets | No | The paper is theoretical and assumes a generative model for the data; it does not use or mention a specific dataset for training or evaluation.
Dataset Splits | No | The paper focuses on theoretical analysis and does not describe experimental validation or dataset splits for such validation.
Hardware Specification | No | The paper is theoretical and does not mention any hardware specifications used for experiments.
Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings.
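Since the paper releases no code, the following is a minimal NumPy sketch of the alternating decode/update loop in Algorithm 1, not the authors' implementation. The names purification and threshold_relu, the use of np.linalg.pinv in place of the minimum-‖A†‖∞ pseudo-inverse, and the way the independent pairs (y, y′) are sampled are all assumptions of this sketch.

```python
import numpy as np


def threshold_relu(z, alpha):
    # phi_alpha: keep entries above the threshold alpha, zero out the rest.
    return np.where(z > alpha, z, 0.0)


def purification(Y, A0, alpha, eta, r, T, seed=0):
    """Sketch of Algorithm 1 (Purification), under the assumptions noted above.

    Y  : (d, N) array whose columns are the examples y1, ..., yN
    A0 : (d, D) initialization of the feature matrix
    """
    rng = np.random.default_rng(seed)
    A = A0.astype(float).copy()
    _, N = Y.shape
    for _ in range(T):
        # (Decode) x = phi_alpha(A^dagger y) for every example y.
        # np.linalg.pinv is a stand-in for the paper's minimum-infinity-norm pseudo-inverse.
        A_pinv = np.linalg.pinv(A)
        X = threshold_relu(A_pinv @ Y, alpha)

        # (Update) hat{E}[(y - y')(x - x')^T] over independent uniform pairs,
        # approximated here by N randomly drawn index pairs (an assumption of this sketch).
        i = rng.integers(N, size=N)
        j = rng.integers(N, size=N)
        dY = Y[:, i] - Y[:, j]          # columns y - y'
        dX = X[:, i] - X[:, j]          # columns x - x'
        E_hat = (dY @ dX.T) / N

        A = (1.0 - eta) * A + r * eta * E_hat
    return A
```

Drawing N random index pairs per iteration is just one way to approximate the pair expectation Ê; for small N one could instead average over all ordered pairs drawn from {y1, ..., yN}.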