reproducibilityindex.ai

Scalable Probabilistic Matrix Factorization with Graph-Based Priors

Authors: Jonathan Strahl, Jaakko Peltonen, Hiroshi Mamitsuka, Samuel Kaski | Pages 5851-5858

AAAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On real data experiments we demonstrate improved prediction accuracy with fewer graph edges... We compare our algorithm to a baseline with no graph SI (PMF, (Mnih and Salakhutdinov 2008)), the current most scalable method, GRALS (Rao et al. 2015), and to evaluate accuracy, less scalable methods KPMF (Zhou et al. 2012) and sRMGCNN (Monti, Bronstein, and Bresson 2017). Table 1: Result summary on real datasets (RMSE)
Researcher Affiliation	Academia	Jonathan Strahl,1 Jaakko Peltonen,2 Hiroshi Mamitsuka,1,3 Samuel Kaski1 1Department of Computer Science, Aalto University, Finland 2Tampere University, Faculty of Information Technology and Communication Science, Finland 3Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan
Pseudocode	Yes	Algorithm 1 Graph-regularised alternating EM (GRAEM)
Open Source Code	Yes	Code: https://github.com/strahl2e/GPMF-GBP-AAAI-20
Open Datasets	Yes	We compare our algorithm to a baseline with no graph SI (PMF, (Mnih and Salakhutdinov 2008)), the current most scalable method, GRALS (Rao et al. 2015), and to evaluate accuracy, less scalable methods KPMF (Zhou et al. 2012) and sRMGCNN (Monti, Bronstein, and Bresson 2017). Table 1: Result summary on real datasets (RMSE)
Dataset Splits	No	No explicit training/validation/test dataset splits (e.g., percentages, sample counts, or predefined split references) are provided.
Hardware Specification	No	The paper mentions 'A 300 thousand dimensional graph with three million edges (Yahoo music side-information) can be analyzed in under ten minutes on a standard laptop computer', which is too vague to count as a hardware specification. It also notes 'For sRMGCNN we used their published code, ran it on a (NVIDIA Tesla P100) GPU', but this refers to a baseline's execution, not the authors' primary experimental setup.
Software Dependencies	No	No specific software dependencies with version numbers (e.g., programming language versions, library versions, or solver versions) are provided.
Experiment Setup	Yes	We generate a 400x400 data matrix by Equations (1)-(3), with proportion of corrupted edges 0.3, observation noise 0.01, 7% observed values, and 40 latent dimensions; we vary these settings in the experiments below. The threshold parameter τ is set to zero (or can be increased for a sparser solution). We initialize the latent feature matrices (U, V) by finding the MAP with no graph SI using PMF.
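
The synthetic settings quoted in the Experiment Setup row can be illustrated with a rough sketch. This is not the authors' generator: Equations (1)-(3) are not reproduced on this page, so the sketch assumes plain i.i.d. standard-normal latent features instead of the paper's graph-based priors, treats "observation noise 0.01" as a standard deviation, and leaves out the side-information graph and its corrupted edges entirely. The function name `generate_synthetic` and all of its parameters are hypothetical.

```python
import random


def generate_synthetic(n_rows=400, n_cols=400, rank=40,
                       obs_fraction=0.07, noise_std=0.01, seed=0):
    """Sample a sparsely observed low-rank matrix (hypothetical helper).

    Assumptions, not taken from the paper: latent features are i.i.d.
    N(0, 1) rather than drawn from a graph-based prior, and noise_std is
    a standard deviation.
    """
    rng = random.Random(seed)

    # Latent feature matrices U (rows) and V (columns), each with `rank` dims.
    U = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_cols)]

    # Choose which cells are observed, uniformly without replacement.
    n_obs = round(obs_fraction * n_rows * n_cols)
    cells = rng.sample(range(n_rows * n_cols), n_obs)

    # Only observed cells are materialised: value = u_i . v_j + Gaussian noise.
    observed = {}
    for cell in cells:
        i, j = divmod(cell, n_cols)
        clean = sum(u * v for u, v in zip(U[i], V[j]))
        observed[(i, j)] = clean + rng.gauss(0.0, noise_std)
    return U, V, observed


if __name__ == "__main__":
    U, V, observed = generate_synthetic()
    print(len(observed), "observed entries out of", 400 * 400)
```

Storing only the observed cells in a dictionary keeps the sketch fast in pure Python; a faithful reproduction would start from the authors' released code linked in the Open Source Code row.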