Generalization Analysis of Deep Non-linear Matrix Completion
Authors: Antoine Ledent, Rodrigo Alves
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide generalization bounds for matrix completion with Schatten p quasi-norm constraints, which is equivalent to deep matrix factorization with Frobenius constraints. In the uniform sampling regime, the sample complexity scales like Õ(rn) where n is the size of the matrix and r is a constraint of the same order as the ground truth rank in the isotropic case. In the distribution-free setting, the bounds scale as Õ(r^{1−p/2} n^{1+p/2}), which reduces to the familiar √r n^{3/2} for p = 1. Furthermore, we provide an analogue of the weighted trace norm for this setting which brings the sample complexity down to Õ(nr) in all cases. We then present a non-linear model, Functionally Rescaled Matrix Completion (FRMC), which applies a single trainable function from ℝ → ℝ to each entry of a latent matrix, and prove that this adds only negligible terms to the overall sample complexity, whilst experiments demonstrate that this simple model improvement already leads to significant gains on real data. |
| Researcher Affiliation | Academia | ¹School of Computing and Information Sciences, Singapore Management University, Singapore; ²Department of Applied Mathematics, Czech Technical University in Prague, Prague, Czech Republic. |
| Pseudocode | No | The paper describes methods in text but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing source code or a link to a code repository. It mentions using TensorFlow 2 for optimization but not open-sourcing its own implementation. |
| Open Datasets | Yes | We chose three standard datasets to evaluate our method in a real data scenario: DOUBAN and MovieLens 25M (MVL25) from the recommender systems domain, and LastFM, which stores listening habits of users in a music streaming platform. |
| Dataset Splits | Yes | For all synthetic data experiments, we employed a %obs of the data for training (specified in each case), allocating 10% for validation, and utilizing the remaining portion as the test set. ... Regarding our validation procedure, we randomly split the observed entries uniformly, resulting in 90% for training and 5% each for the validation and test sets. |
| Hardware Specification | Yes | Regarding hardware specifications, all synthetic experiments were executed on a CPU cluster with 128 threads and 512GB of RAM. ... Real-world experiments were conducted on Nvidia DGX-A100 graphics cards with 40GB of GPU RAM. |
| Software Dependencies | Yes | We optimize all models with ADAM using Nesterov optimization through TensorFlow 2. ... Similar to the synthetic data experiments, we optimized the models using ADAM with Nesterov optimization in TensorFlow 2. |
| Experiment Setup | Yes | In the validation procedure, we selected the regularization parameter from the set λ ∈ {10⁻⁷, 10⁻⁶, . . . , 10²} and fixed the size of the embeddings to 20. We optimize all models with ADAM using Nesterov optimization through TensorFlow 2. We consider a maximum of 100 epochs with early stopping and a patience of 5 in the validation loss, returning the best weights. ... We set a maximum of 100 epochs with early stopping, employing a patience of 15 based on the validation loss for all models. The best weights were selected during training. Real-world experiments were conducted on Nvidia DGX-A100 graphics cards with 40GB of GPU RAM. ... The parameter λ was selected from a sequence exponentially distributed between 10⁻⁷ and 10². For DOUBAN and LASTFM, this sequence has a size of 50, and for MVL25, it has a size of 15. For all datasets and methods, we fixed the embeddings to have a size of 128. |
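The FRMC model quoted in the abstract applies a single trainable scalar function to every entry of a latent factorization. A minimal numpy sketch of that forward pass, assuming a one-hidden-layer network for the scalar function (the paper does not specify its architecture; all weights and sizes here are hypothetical placeholders, and in the paper the function is learned jointly with the factors):

```python
import numpy as np

# Hedged sketch of the FRMC forward pass: a latent low-rank matrix U V^T
# whose entries are each passed through one shared scalar function g: R -> R.
# g is a tiny one-hidden-layer network with made-up weights for illustration.

rng = np.random.default_rng(0)
n, m, r = 100, 80, 20            # rows, columns, embedding size (20 as in the synthetic setup)
U = rng.normal(size=(n, r))
V = rng.normal(size=(m, r))

# parameters of the shared scalar function g (hypothetical shapes)
w1 = rng.normal(size=(1, 8)); b1 = np.zeros(8)
w2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

def g(x):
    """Apply the shared scalar function elementwise to a matrix."""
    h = np.tanh(x[..., None] * w1 + b1)   # (n, m, 8): same g at every entry
    return (h @ w2 + b2)[..., 0]          # back to (n, m)

X_hat = g(U @ V.T)   # FRMC prediction: entrywise nonlinearity on a factorization
```

Because g is shared across entries, the model adds only the scalar function's parameters on top of the factorization, which is consistent with the abstract's claim that the nonlinearity contributes only negligible terms to the sample complexity.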
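The setup row describes a λ grid exponentially spaced between 10⁻⁷ and 10², and early stopping on the validation loss with a patience budget and best-weights restoration. A small sketch of that selection logic, assuming the actual model training (TensorFlow 2, ADAM with Nesterov) is abstracted into a stream of per-epoch validation losses:

```python
import numpy as np

# Hedged sketch of the validation machinery described in the setup row.
# The real training loop (TF2 / Nadam) is abstracted away; this only
# reproduces the early-stopping bookkeeping and the lambda grid.

def early_stopping_fit(val_losses, patience=5, max_epochs=100):
    """Return (best_epoch, stop_epoch) given per-epoch validation losses."""
    best, best_epoch, wait = np.inf, -1, 0
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0   # checkpoint "best weights"
        else:
            wait += 1
            if wait >= patience:                       # patience exhausted
                return best_epoch, epoch
    return best_epoch, min(len(val_losses), max_epochs) - 1

# lambda grid: log-spaced between 1e-7 and 1e2; the paper uses 50 values
# for DOUBAN/LASTFM and 15 for MVL25 (10 here just for illustration)
lams = np.logspace(-7, 2, num=10)
```

Restoring the epoch with the lowest validation loss matches "returning the best weights" in the quoted setup; in Keras this corresponds to `EarlyStopping(restore_best_weights=True)`.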