Scalable Probabilistic Matrix Factorization with Graph-Based Priors

Authors: Jonathan Strahl, Jaakko Peltonen, Hiroshi Mamitsuka, Samuel Kaski
Pages: 5851-5858

AAAI 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | On real data experiments we demonstrate improved prediction accuracy with fewer graph edges... We compare our algorithm to a baseline with no graph SI (PMF, (Mnih and Salakhutdinov 2008)), the current most scalable method, GRALS (Rao et al. 2015), and to evaluate accuracy less scalable methods KPMF (Zhou et al. 2012) and sRMGCNN (Monti, Bronstein, and Bresson 2017). Table 1: Result summary on real datasets (RMSE) |
| Researcher Affiliation | Academia | Jonathan Strahl (1), Jaakko Peltonen (2), Hiroshi Mamitsuka (1, 3), Samuel Kaski (1). (1) Department of Computer Science, Aalto University, Finland; (2) Tampere University, Faculty of Information Technology and Communication Science, Finland; (3) Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan |
| Pseudocode | Yes | Algorithm 1: Graph-regularised alternating EM (GRAEM) |
| Open Source Code | Yes | Code: https://github.com/strahl2e/GPMF-GBP-AAAI-20 |
| Open Datasets | Yes | We compare our algorithm to a baseline with no graph SI (PMF, (Mnih and Salakhutdinov 2008)), the current most scalable method, GRALS (Rao et al. 2015), and to evaluate accuracy less scalable methods KPMF (Zhou et al. 2012) and sRMGCNN (Monti, Bronstein, and Bresson 2017). Table 1: Result summary on real datasets (RMSE) |
| Dataset Splits | No | No explicit training/validation/test dataset splits (e.g., percentages, sample counts, or predefined split references) are provided. |
| Hardware Specification | No | The paper mentions 'A 300 thousand dimensional graph with three million edges (Yahoo music side-information) can be analyzed in under ten minutes on a standard laptop computer', which is too vague to count as a hardware specification. It also notes 'For sRMGCNN we used their published code, ran it on a (NVIDIA Tesla P100) GPU', but this refers to a baseline's execution, not the authors' primary experimental setup. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., programming language versions, library versions, or solver versions) are provided. |
| Experiment Setup | Yes | We generate a 400x400 data matrix by Equations (1)-(3), with proportion of corrupted edges 0.3, observation noise 0.01, 7% observed values, and 40 latent dimensions; we vary these settings in the experiments below. The threshold parameter τ is set to zero (or can be increased for a sparser solution). We initialize the latent feature matrices (U, V) by finding the MAP with no graph SI using PMF. |
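
The experiment setup quoted above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: Equations (1)-(3) are not reproduced in this summary, so the generator below is a plain Gaussian low-rank stand-in with no graph prior and no edge corruption, and the observation noise 0.01 is treated as a standard deviation. The function names (`make_synthetic`, `pmf_map_als`, `rmse`), the ridge weight `lam`, and the iteration count are hypothetical. The PMF MAP estimate is obtained here by alternating ridge regressions, which is one common way to compute it; the paper may use a different solver.

```python
import numpy as np


def make_synthetic(n=400, m=400, rank=40, obs_frac=0.07, noise_std=0.01, seed=0):
    """Gaussian low-rank matrix with a random observation mask.

    ASSUMPTION: stand-in for the paper's Equations (1)-(3); it has no graph
    prior and no corrupted edges, and treats the noise level as a std.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(size=(n, rank)) / np.sqrt(rank)
    V = rng.normal(size=(m, rank)) / np.sqrt(rank)
    R = U @ V.T + noise_std * rng.normal(size=(n, m))
    mask = rng.random((n, m)) < obs_frac  # True where an entry is observed
    return R, mask


def pmf_map_als(R, mask, rank=40, lam=0.1, n_iters=20, seed=0):
    """PMF MAP estimate (no graph side information) via alternating ridge solves.

    ASSUMPTION: `lam` (noise variance over prior variance) and `n_iters`
    are illustrative values, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.normal(size=(n, rank))
    V = 0.1 * rng.normal(size=(m, rank))
    reg = lam * np.eye(rank)
    for _ in range(n_iters):
        # Update each row factor using only that row's observed entries.
        for i in range(n):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + reg, Vo.T @ R[i, obs])
        # Update each column factor using only that column's observed entries.
        for j in range(m):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + reg, Uo.T @ R[obs, j])
    return U, V


def rmse(R, R_hat, mask):
    """Root mean squared error over the entries selected by `mask`."""
    diff = (R - R_hat)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))


if __name__ == "__main__":
    R, mask = make_synthetic()
    U0, V0 = pmf_map_als(R, mask)
    print("RMSE on observed entries:", rmse(R, U0 @ V0.T, mask))
    print("RMSE on held-out entries:", rmse(R, U0 @ V0.T, ~mask))
```

The resulting `U0` and `V0` would then serve as the starting point for a graph-regularised method such as GRAEM; that step is not shown because Algorithm 1 is not reproduced in this summary.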