How Much Restricted Isometry is Needed In Nonconvex Matrix Recovery?
Authors: Richard Y. Zhang, Cédric Josz, Somayeh Sojoudi, Javad Lavaei
NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | One specific counterexample has RIP constant δ = 1/2, but causes randomly initialized stochastic gradient descent (SGD) to fail 12% of the time. SGD is frequently able to avoid and escape spurious local minima, but this empirical result shows that it can occasionally be defeated by their existence. Hence, while exact recovery guarantees will likely require a proof of no spurious local minima, arguments based solely on norm preservation will only be applicable to a narrow set of nearly-isotropic instances. Repeating these trials 100,000 times yields 87,947 successful trials, for a failure rate of 12.1 ± 0.3% to three standard deviations (this figure is checked in the second sketch below the table). In Section 4, we provide strong empirical evidence for the following sharp version of Theorem 2. In Section 5, we apply randomly initialized SGD to instances of (1) engineered to contain spurious local minima. |
| Researcher Affiliation | Academia | Richard Y. Zhang University of California, Berkeley ryz@alum.mit.edu Cédric Josz University of California, Berkeley cedric.josz@gmail.com Somayeh Sojoudi University of California, Berkeley sojoudi@berkeley.edu Javad Lavaei University of California, Berkeley lavaei@berkeley.edu |
| Pseudocode | No | The paper describes mathematical formulations and procedures, but these are presented as equations and narrative text rather than structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific links to a code repository or explicit statements about releasing the source code for the described methodology. |
| Open Datasets | No | The experiments described in the paper use synthetically generated data (e.g., 'randomly sample x, z ∈ R^{n×r} i.i.d. from the standard Gaussian') rather than named public datasets, and no access information for any dataset is provided. |
| Dataset Splits | No | The paper conducts experiments involving numerical optimization trials (e.g., '100,000 trials' of SGD), rather than training/evaluating models on predefined datasets. Consequently, there is no explicit mention of training, validation, or test dataset splits in the context of data partitioning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU/GPU models, memory, or cloud resources) used for running the experiments. |
| Software Dependencies | No | The paper mentions software tools like 'SeDuMi', 'SDPT3', and 'MOSEK' for numerical solutions, but it does not provide specific version numbers for these or any other software dependencies used in the experiments. |
| Experiment Setup | Yes | Solving Example 3 using stochastic gradient descent randomly initialized with the standard Gaussian. (Left) Histogram over 100,000 trials of final error ‖xx^T - Z‖_F after 10^3 steps with learning rate α = 10^-3 and momentum β = 0.9. (Left) Error distribution after 10,000 SGD steps (rate 10^-4, momentum 0.9) over 1,000 trials. (rate 10^-3, momentum 0.9) over 1,000 trials. A hedged sketch of this trial loop follows the table. |
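
The Experiment Setup row above quotes the paper's figure captions: randomly initialized SGD with momentum on a factored matrix-recovery problem, with the final error measured as ‖xx^T - Z‖_F and the outcome recorded over repeated trials. The sketch below illustrates that kind of trial loop under stated assumptions; the paper's Example 3 instance is not reproduced here, so the dimensions `n` and `r`, the sensing matrices `A_mats`, the ground truth `Z`, the success threshold, and the one-measurement-per-step sampling are all placeholders or assumptions, not the authors' code.

```python
import numpy as np

# Hedged sketch of the randomly initialized SGD trial described in the table.
# The paper's Example 3 instance is NOT reproduced: problem size, sensing
# matrices, ground truth, and success threshold below are placeholders.

rng = np.random.default_rng(0)

n, r = 3, 1                                                 # placeholder problem size
x_true = rng.standard_normal((n, r))
Z = x_true @ x_true.T                                       # placeholder rank-r ground truth
A_mats = [rng.standard_normal((n, n)) for _ in range(5)]    # placeholder sensing matrices
b = np.array([np.sum(A * Z) for A in A_mats])               # measurements b_i = <A_i, Z>


def run_trial(steps=1_000, lr=1e-3, momentum=0.9):
    """One SGD run on f(x) = sum_i (<A_i, x x^T> - b_i)^2 from a Gaussian init.

    Returns the final error ||x x^T - Z||_F, the quantity shown in the paper's histogram.
    """
    x = rng.standard_normal((n, r))           # standard Gaussian initialization
    v = np.zeros_like(x)
    for _ in range(steps):
        i = rng.integers(len(A_mats))         # assumption: stochasticity = one random measurement per step
        A, bi = A_mats[i], b[i]
        residual = np.sum(A * (x @ x.T)) - bi
        g = 2.0 * residual * (A + A.T) @ x    # gradient of the sampled residual term
        v = momentum * v - lr * g             # heavy-ball momentum update
        x = x + v
    return np.linalg.norm(x @ x.T - Z, "fro")


# Estimate the failure rate over repeated trials (the paper uses 100,000 trials;
# the 1e-2 success threshold below is an assumption, not taken from the paper).
trials = 1_000
errors = np.array([run_trial() for _ in range(trials)])
print(f"estimated failure rate: {np.mean(errors > 1e-2):.3f}")
```

Substituting the paper's actual counterexample instance for these placeholders would turn the same loop into an estimator of the reported failure rate.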
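
The "12.1 ± 0.3% to three standard deviations" figure in the Research Type row is consistent with a binomial standard error over 100,000 trials with 87,947 successes; the short check below is my arithmetic, not code from the paper.

```python
import math

# Check of the reported failure rate and its three-standard-deviation width.
failures = 100_000 - 87_947              # 12,053 failed trials
p = failures / 100_000                   # failure proportion, about 0.1205
se = math.sqrt(p * (1 - p) / 100_000)    # binomial standard error of the proportion
print(f"{100 * p:.1f}% +/- {100 * 3 * se:.1f}% (three standard deviations)")
# prints: 12.1% +/- 0.3%
```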