Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

A Well-Tempered Landscape for Non-convex Robust Subspace Recovery

Authors: Tyler Maunu, Teng Zhang, Gilad Lerman

JMLR 2019 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We present a mathematical analysis of a non-convex energy landscape for robust subspace recovery. We prove that an underlying subspace is the only stationary point and local minimizer in a specified neighborhood under a deterministic condition on a dataset. If the deterministic condition is satisfied, we further show that a geodesic gradient descent method over the Grassmannian manifold can exactly recover the underlying subspace when the method is properly initialized. Proper initialization by principal component analysis is guaranteed with a simple deterministic condition. Under slightly stronger assumptions, the gradient descent method with a piecewise constant step-size scheme achieves linear convergence. The practicality of the deterministic condition is demonstrated on some statistical models of data, and the method achieves almost state-of-the-art recovery guarantees on the Haystack Model for different regimes of sample size and ambient dimension. In particular, when the ambient dimension is fixed and the sample size is large enough, we show that our gradient method can exactly recover the underlying subspace for any fixed fraction of outliers (less than 1). ... In this section, we run two simulations to verify the theory we proved earlier. All experiments are run using the Haystack Model.
Researcher Affiliation Academia Tyler Maunu (EMAIL), Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA 02139; Teng Zhang (EMAIL), Department of Mathematics, University of Central Florida, Orlando, FL 32816; Gilad Lerman (EMAIL), School of Mathematics, University of Minnesota, Minneapolis, MN 55455
Pseudocode Yes Algorithm 1 RSR by Geodesic Gradient Descent (GGD)
1: Input: dataset X, subspace dimension d, initial step-size s, tolerance τ, constant step interval length K
2: Output: V ∈ O(D, d), whose columns span the robust subspace
3: V_1 = PCA(X, d)
4: k = 1
5: while θ_1(V_k, V_{k−1}) > τ or k = 1 do
6:   Compute ∇F(V_k; X) by (6) and (8)
7:   Compute the SVD U_k Σ_k W_k^T = ∇F(V_k; X)
8:   s_k = s / 2^⌊k/K⌋
9:   V_{k+1} = V_k W_k cos(Σ_k s_k) W_k^T + U_k sin(Σ_k s_k) W_k^T   (22)
10:  k = k + 1
11: end while
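The GGD update above can be sketched in NumPy as follows. This is an illustrative reimplementation, not the authors' code: the cost here is the sample-averaged sum of distances to the subspace, the stopping test uses the projector distance rather than the paper's principal angle θ_1, and the `max_iter` cap, `eps` guard, and re-orthogonalization step are additions for numerical robustness.

```python
import numpy as np

def ggd(X, d, s=None, tau=1e-10, K=50, max_iter=200, eps=1e-12):
    """Sketch of geodesic gradient descent (GGD) for robust subspace recovery.

    X: (D, N) data matrix with points as columns; d: subspace dimension.
    Minimizes (1/N) * sum_i ||x_i - V V^T x_i|| over the Grassmannian.
    """
    D, N = X.shape
    if s is None:
        s = 1.0 / D  # initial step-size used in the paper's experiments
    # Line 3: initialize with PCA (top-d left singular vectors of X)
    V = np.linalg.svd(X, full_matrices=False)[0][:, :d]
    for k in range(1, max_iter + 1):
        # Residuals of each point from the current subspace
        R = X - V @ (V.T @ X)                       # (D, N)
        norms = np.linalg.norm(R, axis=0)
        w = 1.0 / np.maximum(norms, eps)            # least-absolute-deviation weights
        # Euclidean gradient of the averaged cost; columns of R are
        # orthogonal to V, so this is already tangent to the Grassmannian
        G = -((R * w) @ (X.T @ V)) / N              # (D, d)
        G = G - V @ (V.T @ G)                       # project (numerical safety)
        # Lines 7-9: geodesic step along -G with piecewise constant step-size
        U, Sig, Wt = np.linalg.svd(-G, full_matrices=False)
        sk = s / 2 ** (k // K)
        V_new = (V @ Wt.T) @ np.diag(np.cos(Sig * sk)) @ Wt \
                + U @ np.diag(np.sin(Sig * sk)) @ Wt
        V_new, _ = np.linalg.qr(V_new)              # re-orthogonalize
        if np.linalg.norm(V_new @ V_new.T - V @ V.T) < tau:
            V = V_new
            break
        V = V_new
    return V
```

The geodesic step in lines 7–9 follows the standard closed form for Grassmannian geodesics: moving from V along a tangent direction with thin SVD UΣW^T for time t gives V W cos(Σt) W^T + U sin(Σt) W^T.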
Open Source Code Yes A supplementary web page with code will be made available at https://twmaunu.github.io/WTL/.
Open Datasets No The paper uses simulated data generated from statistical models such as the Haystack Model and Generalized Haystack Model. While the models are described in detail, there is no mention of using or providing access to any pre-existing, publicly available datasets for experiments. The text focuses on how data is *generated* for simulations rather than obtained from external sources.
Dataset Splits No The paper uses simulated data based on statistical models. It does not mention any explicit training/test/validation splits or cross-validation as it generates data for each simulation instance rather than splitting a fixed dataset.
Hardware Specification No The paper describes running simulations but does not specify any particular hardware components like CPU models, GPU models, or detailed computing environments.
Software Dependencies No The paper mentions 'MATLAB' in the context of machine precision limits but does not specify a version number or any other software libraries or dependencies with version numbers.
Experiment Setup Yes Data was generated for the Haystack Model with parameters Nin = 200, σin = 1, Nout = 200, σout = 1, D = 200, and d = 10. ... The data was generated according to the Haystack Model with parameters Nin = 200, σin = 1, Nout = 200, σout = 1, D = 100, and d = 5. ... For all algorithms, the initial step-size is s = 1/D. For the piecewise constant step-size scheme of Algorithm 1, we use two different combinations of the shrink factor and the time between shrinking, K.
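The Haystack Model data described above can be generated with a short NumPy sketch. The normalization below (inlier covariance (σ_in²/d)·Π_{L*}, outlier covariance (σ_out²/D)·I_D) follows the model's usual specification and is an assumption here, as the row only lists the parameter values.

```python
import numpy as np

def haystack(D, d, N_in, N_out, sigma_in=1.0, sigma_out=1.0, rng=None):
    """Sample from the Haystack Model: Gaussian inliers on a random
    d-dimensional subspace L* of R^D plus isotropic Gaussian outliers.

    Returns the (D, N_in + N_out) data matrix and an orthonormal
    basis of the underlying subspace.
    """
    rng = np.random.default_rng(rng)
    # Random orthonormal basis for the underlying subspace L*
    V_star, _ = np.linalg.qr(rng.standard_normal((D, d)))
    # Inliers ~ N(0, (sigma_in^2 / d) * Pi_{L*})
    inliers = (sigma_in / np.sqrt(d)) * (V_star @ rng.standard_normal((d, N_in)))
    # Outliers ~ N(0, (sigma_out^2 / D) * I_D)
    outliers = (sigma_out / np.sqrt(D)) * rng.standard_normal((D, N_out))
    return np.hstack([inliers, outliers]), V_star
```

For the first setup above one would call `haystack(D=200, d=10, N_in=200, N_out=200)`, and for the second `haystack(D=100, d=5, N_in=200, N_out=200)`.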