Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled Gradient Descent

Authors: Tian Tong, Cong Ma, Yuejie Chi

JMLR 2021

Research Type: Experimental
LLM Response: "Finally, numerical examples are provided to demonstrate the effectiveness of ScaledGD in accelerating the convergence rate of ill-conditioned low-rank matrix estimation in a wide number of applications. Keywords: low-rank matrix factorization, scaled gradient descent, ill-conditioned matrix recovery, matrix sensing, robust PCA, matrix completion, general losses"

Researcher Affiliation: Academia
LLM Response: Tian Tong, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Cong Ma, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA 94720, USA; Yuejie Chi, Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Pseudocode: Yes
LLM Response: Algorithm 1: Scaled GD for low-rank matrix sensing with spectral initialization; Algorithm 2: Scaled GD for robust PCA with spectral initialization; Algorithm 3: Scaled PGD for matrix completion with spectral initialization

Open Source Code: Yes
LLM Response: "In this section, we provide numerical experiments to corroborate our theoretical findings, with the codes available at https://github.com/Titan-Tong/ScaledGD."

Open Datasets: No
LLM Response: The paper describes how the authors generated synthetic datasets for their numerical experiments (e.g., "we generate the ground truth matrix X ∈ ℝ^{n×n} in the following way. We first generate an n × r matrix with i.i.d. random signs..."). It does not use or provide concrete access information for publicly available datasets.

Dataset Splits: No
LLM Response: The paper generates synthetic data for its numerical experiments but does not specify training, test, or validation splits. It describes parameters for data generation (e.g., m = 5nr measurements, corruption fraction α = 0.1, observation probability p = 0.2) but not data partitioning for machine learning tasks.

Hardware Specification: Yes
LLM Response: "The simulations are performed in Matlab with a 3.6 GHz Intel Xeon Gold 6244 CPU."

Software Dependencies: No
LLM Response: While Matlab is mentioned ("The simulations are performed in Matlab with a 3.6 GHz Intel Xeon Gold 6244 CPU."), no version number is provided for Matlab or any other software dependency.

Experiment Setup: Yes
LLM Response: "For ease of comparison, we fix η = 0.5 for both Scaled GD and vanilla GD (see Figure 4 for justifications). Both algorithms start from the same spectral initialization."