Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Well-Tempered Landscape for Non-convex Robust Subspace Recovery

Authors: Tyler Maunu, Teng Zhang, Gilad Lerman

JMLR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We present a mathematical analysis of a non-convex energy landscape for robust subspace recovery. We prove that an underlying subspace is the only stationary point and local minimizer in a specified neighborhood under a deterministic condition on a dataset. If the deterministic condition is satisfied, we further show that a geodesic gradient descent method over the Grassmannian manifold can exactly recover the underlying subspace when the method is properly initialized. Proper initialization by principal component analysis is guaranteed with a simple deterministic condition. Under slightly stronger assumptions, the gradient descent method with a piecewise constant step-size scheme achieves linear convergence. The practicality of the deterministic condition is demonstrated on some statistical models of data, and the method achieves almost state-of-the-art recovery guarantees on the Haystack Model for different regimes of sample size and ambient dimension. In particular, when the ambient dimension is fixed and the sample size is large enough, we show that our gradient method can exactly recover the underlying subspace for any fixed fraction of outliers (less than 1). ... In this section, we run two simulations to verify the theory we proved earlier. All experiments are run using the Haystack Model.
Researcher Affiliation Academia Tyler Maunu (EMAIL), Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139; Teng Zhang (EMAIL), Department of Mathematics, University of Central Florida, Orlando, FL 32816; Gilad Lerman (EMAIL), School of Mathematics, University of Minnesota, Minneapolis, MN 55455
Pseudocode Yes Algorithm 1 RSR by Geodesic Gradient Descent (GGD)
1: Input: dataset X, subspace dimension d, initial step-size s, tolerance τ, constant step interval length K
2: Output: V ∈ O(D, d), whose columns span the robust subspace
3: V_1 = PCA(X, d)
4: k = 1
5: while θ_1(V_k, V_{k−1}) > τ or k = 1 do
6:   Compute ∇F(V_k; X) by (6) and (8)
7:   Compute the SVD U_k Σ_k W_k^T = ∇F(V_k; X)
8:   s_k = s / 2^⌊k/K⌋
9:   V_{k+1} = V_k W_k cos(Σ_k s_k) W_k^T + U_k sin(Σ_k s_k) W_k^T   (22)
10:  k = k + 1
11: end while
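The GGD update above can be sketched in NumPy as follows. This is an illustrative reimplementation, not the authors' code: the cost here is the sample-averaged sum of distances to the subspace, the stopping test uses the projector distance rather than the paper's principal angle θ_1, and the `max_iter` cap, `eps` guard, and re-orthogonalization step are additions for numerical robustness.

```python
import numpy as np

def ggd(X, d, s=None, tau=1e-10, K=50, max_iter=200, eps=1e-12):
    """Sketch of geodesic gradient descent (GGD) for robust subspace recovery.

    X: (D, N) data matrix with points as columns; d: subspace dimension.
    Minimizes (1/N) * sum_i ||x_i - V V^T x_i|| over the Grassmannian.
    """
    D, N = X.shape
    if s is None:
        s = 1.0 / D  # initial step-size used in the paper's experiments
    # Line 3: initialize with PCA (top-d left singular vectors of X)
    V = np.linalg.svd(X, full_matrices=False)[0][:, :d]
    for k in range(1, max_iter + 1):
        # Residuals of each point from the current subspace
        R = X - V @ (V.T @ X)                       # (D, N)
        norms = np.linalg.norm(R, axis=0)
        w = 1.0 / np.maximum(norms, eps)            # least-absolute-deviation weights
        # Euclidean gradient of the averaged cost; columns of R are
        # orthogonal to V, so this is already tangent to the Grassmannian
        G = -((R * w) @ (X.T @ V)) / N              # (D, d)
        G = G - V @ (V.T @ G)                       # project (numerical safety)
        # Lines 7-9: geodesic step along -G with piecewise constant step-size
        U, Sig, Wt = np.linalg.svd(-G, full_matrices=False)
        sk = s / 2 ** (k // K)
        V_new = (V @ Wt.T) @ np.diag(np.cos(Sig * sk)) @ Wt \
                + U @ np.diag(np.sin(Sig * sk)) @ Wt
        V_new, _ = np.linalg.qr(V_new)              # re-orthogonalize
        if np.linalg.norm(V_new @ V_new.T - V @ V.T) < tau:
            V = V_new
            break
        V = V_new
    return V
```

The geodesic step in lines 7–9 follows the standard closed form for Grassmannian geodesics: moving from V along a tangent direction with thin SVD UΣW^T for time t gives V W cos(Σt) W^T + U sin(Σt) W^T.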
Open Source Code Yes A supplementary web page with code will be made available at https://twmaunu.github.io/WTL/.
Open Datasets No The paper uses simulated data generated from statistical models such as the Haystack Model and Generalized Haystack Model. While the models are described in detail, there is no mention of using or providing access to any pre-existing, publicly available datasets for experiments. The text focuses on how data is *generated* for simulations rather than obtained from external sources.
Dataset Splits No The paper uses simulated data based on statistical models. It does not mention any explicit training/test/validation splits or cross-validation as it generates data for each simulation instance rather than splitting a fixed dataset.
Hardware Specification No The paper describes running simulations but does not specify any particular hardware components like CPU models, GPU models, or detailed computing environments.
Software Dependencies No The paper mentions 'MATLAB' in the context of machine precision limits but does not specify a version number or any other software libraries or dependencies with version numbers.
Experiment Setup Yes Data was generated for the Haystack Model with parameters Nin = 200, σin = 1, Nout = 200, σout = 1, D = 200, and d = 10. ... The data was generated according to the Haystack Model with parameters Nin = 200, σin = 1, Nout = 200, σout = 1, D = 100, and d = 5. ... For all algorithms, the initial step-size is s = 1/D. For the piecewise constant step-size scheme of Algorithm 1, we use two different combinations of the shrink factor and the time between shrinking, K.
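The Haystack Model data described above can be generated with a short NumPy sketch. The normalization below (inlier covariance (σ_in²/d)·Π_{L*}, outlier covariance (σ_out²/D)·I_D) follows the model's usual specification and is an assumption here, as the row only lists the parameter values.

```python
import numpy as np

def haystack(D, d, N_in, N_out, sigma_in=1.0, sigma_out=1.0, rng=None):
    """Sample from the Haystack Model: Gaussian inliers on a random
    d-dimensional subspace L* of R^D plus isotropic Gaussian outliers.

    Returns the (D, N_in + N_out) data matrix and an orthonormal
    basis of the underlying subspace.
    """
    rng = np.random.default_rng(rng)
    # Random orthonormal basis for the underlying subspace L*
    V_star, _ = np.linalg.qr(rng.standard_normal((D, d)))
    # Inliers ~ N(0, (sigma_in^2 / d) * Pi_{L*})
    inliers = (sigma_in / np.sqrt(d)) * (V_star @ rng.standard_normal((d, N_in)))
    # Outliers ~ N(0, (sigma_out^2 / D) * I_D)
    outliers = (sigma_out / np.sqrt(D)) * rng.standard_normal((D, N_out))
    return np.hstack([inliers, outliers]), V_star
```

For the first setup above one would call `haystack(D=200, d=10, N_in=200, N_out=200)`, and for the second `haystack(D=100, d=5, N_in=200, N_out=200)`.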