reproducibilityindex.ai

Scalable Bayesian Non-linear Matrix Completion

Authors: Xiangju Qin, Paul Blomstedt, Samuel Kaski

IJCAI 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	An empirical evaluation of the method, using simulations and a benchmark dataset, is given in Section 4. The paper ends with conclusions in Section 5. In this section, we evaluate the predictive performance of the proposed method for out-of-matrix prediction problems on simulated and real-world chemogenomic data, and compare it with two alternative approaches.
Researcher Affiliation	Academia	Xiangju Qin, Paul Blomstedt and Samuel Kaski, Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, 00076 Espoo, Finland. xiangju.qin@helsinki.fi, paul.blomstedt@aalto.fi, samuel.kaski@aalto.fi
Pseudocode	No	The paper describes computational strategies and methods but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	Our implementation is based on the GPy package (footnote 1: https://sheffieldml.github.io/GPy/). This refers to a third-party package, not the authors' own implementation.
Open Datasets	Yes	We performed the experiments on ExCAPE-DB data [Sun et al., 2017], which is an aggregation of public compound-target bioactivity data and describes interactions between drugs and targets using the pIC50 measure.
Dataset Splits	Yes	As the task is to perform out-of-matrix prediction, we randomly selected 20% of the rows as a test set, using the remaining rows as the training set. We used 3-fold cross validation to split the training and test set, where about 30% of the rows or compounds were chosen as test set in each fold.
Hardware Specification	No	Macau was run on compute nodes with 20 CPUs; all the other methods were run on a single CPU. (This specifies the number of CPUs but not the specific CPU model or other hardware details.)
Software Dependencies	No	"Our implementation is based on the GPy package."; "The dataset has 469 chem2vec features as side information which are generated from ECFP fingerprint features for the compounds using word2vec software."; "We ran the Macau version available in SMURFF software: https://github.com/ExaScience/smurff." None of these passages mentions a specific version number.
Experiment Setup	Yes	The experimental setting for MRD models is: number of inducing points 100, optimization through scaled conjugate gradients (SCG) with 500 iterations. For the SoD approach, the latent variables were initialized with PPCA method. We ran Macau with Gibbs sampling for 1200 iterations, discarded the first 800 samples as burn-in and saved every second of the remaining samples yielding in total 200 posterior samples. We set the dimension of latent variables K=10 for ExCAPE-DB data, K=5 for simulated data for all methods.
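
The row-wise splits and the Macau sample count quoted in the table can be checked with a minimal sketch. This is an illustration only, not the authors' code: the function names (split_rows, kfold_rows, retained_samples), the seeds, and the use of NumPy are assumptions made here. The paper states the split fractions and iteration counts but not how the splits were generated. Note that an even 3-fold partition puts roughly a third of the rows in each test fold, which is close to the "about 30%" quoted above.

```python
import numpy as np


def split_rows(n_rows, test_fraction, seed=0):
    """Hold out whole rows at random (out-of-matrix prediction).

    Hypothetical helper: returns (train_rows, test_rows) as sorted index arrays.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_rows)
    n_test = int(round(test_fraction * n_rows))
    return np.sort(perm[n_test:]), np.sort(perm[:n_test])


def kfold_rows(n_rows, n_folds=3, seed=0):
    """Yield (train_rows, test_rows) pairs for row-wise k-fold cross-validation.

    Hypothetical helper: each row appears in exactly one test fold.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_rows), n_folds)
    for i in range(n_folds):
        test = np.sort(folds[i])
        train = np.sort(
            np.concatenate([folds[j] for j in range(n_folds) if j != i])
        )
        yield train, test


def retained_samples(n_iter, burn_in, thin):
    """Count the samples kept after discarding burn-in and keeping every thin-th draw."""
    return len(range(burn_in, n_iter, thin))


if __name__ == "__main__":
    train, test = split_rows(n_rows=100, test_fraction=0.2)
    print(len(train), len(test))
    for train, test in kfold_rows(n_rows=100, n_folds=3):
        print(len(train), len(test))
    # 1200 Gibbs iterations, 800 burn-in, every second sample kept
    print(retained_samples(n_iter=1200, burn_in=800, thin=2))
```

With 1200 iterations, 800 discarded as burn-in, and every second remaining sample saved, retained_samples gives (1200 - 800) / 2 = 200, which agrees with the 200 posterior samples reported in the Experiment Setup row.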