Implicit Regularization in Matrix Factorization
Authors: Suriya Gunasekar, Blake E. Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, Nati Srebro
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We conjecture and provide empirical and theoretical evidence that with small enough step sizes and initialization close enough to the origin, gradient descent on a full dimensional factorization converges to the minimum nuclear norm solution." Also: "Our empirical study leads us to conjecture that with small step sizes and initialization close to zero, gradient descent converges to the minimum nuclear norm solution, and we provide empirical and theoretical evidence for this conjecture, proving it in certain restricted settings." (Abstract and Introduction) Also: "Beyond the matrix reconstruction experiments of Section 2, we also conducted experiments with similarly simulated matrix completion problems, including problems where entries are sampled from power-law distributions (thus not satisfying incoherence), as well as matrix completion problem on non-simulated Movielens data. In addition to gradient descent, we also looked more directly at the gradient flow ODE (3) and used a numerical ODE solver provided as part of SciPy [8] to solve (3)." (Section 5) |
| Researcher Affiliation | Academia | Suriya Gunasekar TTI at Chicago suriya@ttic.edu Blake Woodworth TTI at Chicago blake@ttic.edu Srinadh Bhojanapalli TTI at Chicago srinadh@ttic.edu Behnam Neyshabur TTI at Chicago behnam@ttic.edu Nathan Srebro TTI at Chicago nati@ttic.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about open-sourcing code for the described methodology or any links to code repositories. |
| Open Datasets | Yes | (iv) Benchmark movie recommendation dataset Movielens 100k. The dataset contains 100k ratings from n1 = 943 users on n2 = 1682 movies. |
| Dataset Splits | No | The paper mentions 'training data' and 'test error is computed on a held out data of 10 ratings per user' for Movielens, but does not provide specific details on the dataset split percentages, sample counts for each split, or explicit cross-validation methodology. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'SciPy' for a numerical ODE solver but does not provide a specific version number for this software or any other key software dependencies. |
| Experiment Setup | Yes | "Figure 1 shows the normalized training objective and reconstruction error as a function of the dimensionality d of the factorization, for different initialization and step-size policies..." The Figure 1 legend lists the settings ‖U₀‖_F = 10⁻⁴, η; ‖U₀‖_F = 10⁻⁴, η = 10⁻³; and ‖U₀‖_F = 1, η = 10⁻³. Also, for the exhaustive search: "run the ODE solver on every remaining instance with a random U₀ such that ‖U₀‖_F = α, for different values of α. Results on the deviation from the minimum nuclear norm are reported in Figure 4. For small α = 10⁻⁵, 10⁻³... This behavior also decays for α = 1." |
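
Since the paper provides neither pseudocode nor code, the setup quoted in the table can be illustrated with a minimal sketch: plain gradient descent on a full-dimensional factorization X = UUᵀ, started from a random U₀ scaled so that ‖U₀‖_F = α. This is not the authors' implementation. The problem size (`n = 5`, `m = 10`), the random seed, the step size `eta = 5e-4`, the step count, and the function names `sensing_loss` and `gradient_descent` are all assumptions chosen for illustration, and the measurement matrices are simulated symmetric Gaussians.

```python
import numpy as np


def sensing_loss(U, A, y):
    """Sum of squared errors between the measurements <A_i, U U^T> and y."""
    residual = np.einsum("kij,ij->k", A, U @ U.T) - y
    return float(residual @ residual)


def gradient_descent(A, y, alpha, eta, steps, rng):
    """Gradient descent on U, starting from a random U0 with ||U0||_F = alpha."""
    n = A.shape[1]
    U = rng.standard_normal((n, n))
    U *= alpha / np.linalg.norm(U)  # rescale so the Frobenius norm equals alpha
    for _ in range(steps):
        residual = np.einsum("kij,ij->k", A, U @ U.T) - y
        # For symmetric A_i, the gradient of the loss is 4 * sum_i r_i * A_i @ U.
        grad = 4.0 * np.einsum("k,kij->ij", residual, A) @ U
        U -= eta * grad
    return U


# Hypothetical toy instance (sizes and seed are illustrative, not from the paper).
rng = np.random.default_rng(0)
n, m = 5, 10  # m is below n(n+1)/2 = 15, so the problem is underdetermined

u = rng.standard_normal((n, 1))
u /= np.linalg.norm(u)
X_star = u @ u.T  # rank-1 PSD ground truth

G = rng.standard_normal((m, n, n))
A = (G + G.transpose(0, 2, 1)) / 2  # symmetric measurement matrices
y = np.einsum("kij,ij->k", A, X_star)

initial_loss = sensing_loss(np.zeros((n, n)), A, y)
U = gradient_descent(A, y, alpha=1e-4, eta=5e-4, steps=6000, rng=rng)
final_loss = sensing_loss(U, A, y)

X_hat = U @ U.T
print("training loss:", initial_loss, "->", final_loss)
print("nuclear norm of U U^T:", np.linalg.norm(X_hat, "nuc"))
print("reconstruction error:", np.linalg.norm(X_hat - X_star))
```

Rerunning with a larger `alpha` (for example `alpha=1.0`) gives a rough way to compare how the initialization scale affects the nuclear norm of the solution that gradient descent reaches. Checking it against the true minimum nuclear norm solution would additionally require a convex solver, which this sketch leaves out.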