Matrix Completion with Noisy Side Information

Authors: Kai-Yang Chiang, Cho-Jui Hsieh, Inderjit S. Dhillon

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we consider synthetic data and two applications, relationship prediction and semi-supervised clustering, and show that our model outperforms other methods for matrix completion that use features, both in theory and practice." "In addition, we empirically show that our model outperforms other completion methods on synthetic data as well as in two applications: relationship prediction and semi-supervised clustering. We show experimental results in Section 5."
Researcher Affiliation | Academia | Kai-Yang Chiang and Inderjit S. Dhillon (University of Texas at Austin, {kychiang,inderjit}@cs.utexas.edu); Cho-Jui Hsieh (University of California at Davis, chohsieh@ucdavis.edu)
Pseudocode | Yes | "Our algorithm is stated in detail in Appendix A." "Our algorithm, summarized in Algorithm 2 in Appendix D, first completes the pairwise matrix with the Dirty IMC objective (2) instead of IMC (with both X and Y set to Z), and then runs k-means on the top-k eigenvectors of the completed matrix to obtain a clustering." (A hedged sketch of this clustering pipeline appears after the table.)
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is open source or publicly available.
Open Datasets | Yes | "We consider the relationship prediction problem in an online review website, Epinions [26]." "All datasets are available at http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/." "For Covtype, we subsample from the entire dataset so that each cluster has balanced size." (A sketch of such balanced subsampling appears after the table.)
Dataset Splits | Yes | "We conduct the experiment using 10-fold cross validation on observed edges, where the parameters are chosen from the set {10^α, 5·10^α} with α ranging from -3 to 2." (A sketch of this cross-validated grid search appears after the table.)
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as CPU or GPU models, memory, or cloud computing specifications.
Software Dependencies | No | The paper mentions the names of methods and toolkits (e.g., 'SVDfeature', 'k-means') but does not specify any software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9', 'CUDA 11.1').
Experiment Setup | No | The paper states that parameters are selected from a given set (e.g., {10^α} or {10^α, 5·10^α} with α ranging from -3 to 2) for the best recovery, but it does not specify the concrete hyperparameter values or other system-level training settings (e.g., learning rate, batch size, number of epochs) that were ultimately used for the reported results.
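
The clustering pipeline quoted in the Pseudocode row (complete the pairwise matrix, then run k-means on its top-k eigenvectors) can be sketched roughly as follows. This is a minimal illustration, not the paper's Algorithm 2: the completion step is a generic low-rank gradient-descent stand-in for the Dirty IMC objective (2) with X = Y = Z, and names such as `complete_pairwise_matrix` are hypothetical.

```python
# Hedged sketch: complete a partially observed, symmetric pairwise similarity
# matrix, then cluster via k-means on its top-k eigenvectors. The completion
# below is a generic low-rank least-squares stand-in, NOT the paper's Dirty IMC solver.
import numpy as np
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans


def complete_pairwise_matrix(S_obs, mask, rank=10, n_iters=200, lr=1e-3, reg=0.1):
    """Fill in a symmetric similarity matrix from its observed entries.

    S_obs : (n, n) array of observed similarities (zeros elsewhere).
    mask  : (n, n) 0/1 array marking which entries were observed.
    """
    n = S_obs.shape[0]
    rng = np.random.default_rng(0)
    U = 0.01 * rng.standard_normal((n, rank))
    for _ in range(n_iters):
        resid = mask * (U @ U.T - S_obs)        # error on observed entries only
        grad = 4 * resid @ U + 2 * reg * U      # gradient of squared loss + ridge penalty
        U -= lr * grad
    return U @ U.T


def spectral_kmeans(S_completed, k):
    """k-means on the top-k eigenvectors of the completed pairwise matrix."""
    _, V = eigsh(S_completed, k=k, which="LA")  # k largest (algebraic) eigenpairs
    return KMeans(n_clusters=k, n_init=10).fit_predict(V)
```

A call such as `spectral_kmeans(complete_pairwise_matrix(S_obs, mask, rank=10), k=5)` would then return one cluster label per node.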
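The Covtype preprocessing quoted in the Open Datasets row (subsampling so every cluster has balanced size) amounts to per-class subsampling. A minimal sketch is below; the per-class sample size is an assumed knob, not a value reported in the paper.

```python
# Hedged sketch of balanced subsampling: keep the same number of points from
# every class so the resulting clusters have equal size.
import numpy as np


def balanced_subsample(X, y, per_class, seed=0):
    """Return a subset of (X, y) with `per_class` points drawn from each class."""
    rng = np.random.default_rng(seed)
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=per_class, replace=False)
        for c in np.unique(y)
    ])
    return X[keep], y[keep]
```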
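The parameter selection quoted in the Dataset Splits and Experiment Setup rows, a grid {10^α, 5·10^α} for α = -3, ..., 2 searched with 10-fold cross-validation over observed edges, can be sketched as below. The `fit_and_score` callback and the assumption that `observed_edges` is an array of (i, j, value) triples are hypothetical placeholders, not the paper's code.

```python
# Hedged sketch of a 10-fold cross-validated grid search over the
# regularization set {10^a, 5*10^a}, a = -3, ..., 2.
import numpy as np
from sklearn.model_selection import KFold

# {10^a, 5*10^a} for a in {-3, ..., 2}
PARAM_GRID = sorted(v * 10.0 ** a for a in range(-3, 3) for v in (1, 5))


def select_parameter(observed_edges, fit_and_score, n_splits=10, seed=0):
    """Return the grid value with the best mean validation score."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    best_val, best_score = None, -np.inf
    for lam in PARAM_GRID:
        scores = [
            fit_and_score(observed_edges[train], observed_edges[val], lam)
            for train, val in kf.split(observed_edges)
        ]
        if np.mean(scores) > best_score:
            best_val, best_score = lam, float(np.mean(scores))
    return best_val, best_score
```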