Matrix Completion from Fewer Entries: Spectral Detectability and Rank Estimation

Authors: Alaa Saade, Florent Krzakala, Lenka Zdeborová

NeurIPS 2015

Reproducibility assessment: each variable below lists the result and the supporting LLM response.
Research Type: Experimental. LLM response: "Finally, in Sec. 4 we present numerical simulations that demonstrate the efficiency of MaCBetH."
Researcher Affiliation: Academia. LLM response: "Alaa Saade¹ and Florent Krzakala¹,²: ¹Laboratoire de Physique Statistique, CNRS & École Normale Supérieure, Paris, France; ²Sorbonne Universités, Université Pierre et Marie Curie Paris 06, F-75005, Paris, France. Lenka Zdeborová: Institut de Physique Théorique, CEA Saclay and CNRS UMR 3681, 91191 Gif-sur-Yvette, France."
Pseudocode: Yes. LLM response:

Algorithm (MaCBetH)
1. Numerically solve for the value of $\hat{\beta}_{\mathrm{SG}}$ such that $F(\hat{\beta}_{\mathrm{SG}}) = 1$, where
   $F(\beta) := \frac{1}{\sqrt{nm}} \sum_{(i,j) \in E} \tanh^2(\beta M_{ij})$.  (5)
2. Build the Bethe Hessian $H(\hat{\beta}_{\mathrm{SG}})$ following eq. (4).
3. Compute all its negative eigenvalues $\lambda_1, \dots, \lambda_{\hat{r}}$ and corresponding eigenvectors $v_1, \dots, v_{\hat{r}}$; $\hat{r}$ is our estimate for the rank $r$. Set $X^0$ (resp. $Y^0$) to be the first $n$ lines (resp. the last $m$ lines) of the matrix $[v_1\, v_2\, \cdots\, v_{\hat{r}}]$.
4. Perform local optimization of the cost function (2) with rank $\hat{r}$ and initial condition $X^0, Y^0$.
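The spectral steps 1-3 translate almost directly into code. Below is a minimal sketch in Python/SciPy, assuming the weighted Bethe Hessian of eq. (4) takes the form standard in the Bethe Hessian literature, $H_{ii} = 1 + \sum_k \omega_{ik}^2/(1-\omega_{ik}^2)$ and $H_{ij} = -\omega_{ij}/(1-\omega_{ij}^2)$ with $\omega_{ij} = \tanh(\beta M_{ij})$, on the bipartite graph of observed entries. The function name macbeth_spectral, the root-finding bracket, and the cap on the number of eigenpairs are illustrative choices, not from the paper.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.sparse import coo_matrix, diags
from scipy.sparse.linalg import eigsh

def macbeth_spectral(rows, cols, vals, n, m, k_max=20):
    """Spectral part of MaCBetH (sketch). rows/cols index the observed
    entries (i, j) of the n x m matrix, vals holds M_ij.
    Returns (r_hat, X0, Y0)."""
    # Step 1: solve F(beta) = (1/sqrt(nm)) * sum tanh^2(beta*M_ij) = 1.
    F = lambda beta: np.sum(np.tanh(beta * vals) ** 2) / np.sqrt(n * m) - 1.0
    beta_sg = brentq(F, 1e-6, 50.0)  # bracket chosen for illustration; widen if needed

    # Step 2: assemble H(beta_SG) on the (n+m)-node bipartite graph
    # of observed entries (assumed form of eq. (4), see lead-in above).
    w = np.tanh(beta_sg * vals)
    off = w / (1.0 - w ** 2)       # magnitude of the off-diagonal entries
    dia = w ** 2 / (1.0 - w ** 2)  # per-edge contribution to the diagonal
    i, j = rows, cols + n          # column nodes are shifted by n
    A = coo_matrix((np.r_[off, off], (np.r_[i, j], np.r_[j, i])),
                   shape=(n + m, n + m)).tocsr()
    deg = np.bincount(np.r_[i, j], weights=np.r_[dia, dia], minlength=n + m)
    H = diags(1.0 + deg) - A

    # Step 3: count negative eigenvalues (rank estimate) and split the
    # corresponding eigenvectors into the initial conditions X0, Y0.
    lam, vec = eigsh(H, k=min(k_max, n + m - 2), which='SA')
    neg = lam < 0
    r_hat = int(np.count_nonzero(neg))
    return r_hat, vec[:n, neg], vec[n:, neg]
```

Step 4, the local optimization of cost (2), is sketched after the Experiment Setup entry below.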
Open Source Code: Yes. LLM response: "Implementations of our algorithms in the Julia and Matlab programming languages are available at the SPHINX webpage http://www.lps.ens.fr/~krzakala/WASP.html."
Open Datasets: No. LLM response: "In our numerical examples and theoretical justifications we shall generate the low-rank matrix $M_{\mathrm{true}} = XY^T$ using tall matrices $X$ and $Y$ with iid Gaussian elements; we call this the random matrix setting."
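For concreteness, the random matrix setting described in this quote can be generated in a few lines. The sizes, the rank, and the parametrization of the revealed fraction as $\varepsilon/\sqrt{nm}$ below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, r = 1000, 1000, 3  # illustrative sizes and rank
eps = 20.0               # rescaled number of revealed entries (assumption)

# Tall factors with iid standard Gaussian entries give the low-rank M_true = X Y^T.
X = rng.standard_normal((n, r))
Y = rng.standard_normal((m, r))
M_true = X @ Y.T

# Reveal each entry independently with probability eps / sqrt(n*m).
mask = rng.random((n, m)) < eps / np.sqrt(n * m)
rows, cols = np.nonzero(mask)
vals = M_true[rows, cols]  # observed entries, ready for macbeth_spectral above
```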
Dataset Splits: No. LLM response: The paper uses synthetically generated data and does not mention train/validation/test splits from a pre-existing dataset.
Hardware Specification: No. LLM response: The paper describes the computational tasks and the sizes of the matrices used, but it does not specify the hardware (CPU or GPU models, memory) used to run the experiments.
Software Dependencies: No. LLM response: The paper mentions the "Julia and Matlab programming languages" and refers to the "low storage BFGS algorithm [25], part of NLopt [26]", but does not provide version numbers for any software component.
Experiment Setup: Yes. LLM response: "All methods optimize the cost function (2) using a low-storage BFGS algorithm [25], part of NLopt [26], starting from different initial conditions. The maximum number of iterations was set to 1000. The probabilities were estimated as the frequency of success over 100 samples of matrices $XY^T$ of size $10000 \times 10000$, with the entries of $X, Y$ drawn from a Gaussian distribution of mean 0 and variance 1."
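A sketch of the final local-optimization step is below, assuming cost function (2) is the squared error over the observed entries, $\sum_{(i,j) \in E} ((XY^T)_{ij} - M_{ij})^2$, and substituting SciPy's L-BFGS-B for the NLopt low-storage BFGS used in the paper; the 1000-iteration cap matches the quoted setup.

```python
import numpy as np
from scipy.optimize import minimize

def refine(rows, cols, vals, X0, Y0, max_iter=1000):
    """Local optimization of the (assumed) squared-error cost over the
    observed entries, starting from the spectral initialization X0, Y0."""
    n, r = X0.shape
    m = Y0.shape[0]

    def cost_grad(z):
        X = z[:n * r].reshape(n, r)
        Y = z[n * r:].reshape(m, r)
        # Residuals (X Y^T)_ij - M_ij on the observed entries only.
        resid = np.einsum('ek,ek->e', X[rows], Y[cols]) - vals
        # Gradients accumulated entry by entry with unbuffered adds.
        gX = np.zeros_like(X)
        gY = np.zeros_like(Y)
        np.add.at(gX, rows, 2.0 * resid[:, None] * Y[cols])
        np.add.at(gY, cols, 2.0 * resid[:, None] * X[rows])
        return np.sum(resid ** 2), np.r_[gX.ravel(), gY.ravel()]

    out = minimize(cost_grad, np.r_[X0.ravel(), Y0.ravel()], jac=True,
                   method='L-BFGS-B', options={'maxiter': max_iter})
    return out.x[:n * r].reshape(n, r), out.x[n * r:].reshape(m, r)
```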