Interpretable Drug Target Prediction Using Deep Neural Representation

Authors: Kyle Yingkai Gao, Achille Fokoue, Heng Luo, Arun Iyengar, Sanjoy Dey, Ping Zhang

IJCAI 2018

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
    We experimentally compared our model with matrix factorization, similarity-based methods, and a previous deep learning approach. Overall, the results show that our model outperforms the other approaches without requiring domain knowledge or feature engineering.

Researcher Affiliation | Industry
    Kyle Yingkai Gao, Achille Fokoue, Heng Luo, Arun Iyengar, Sanjoy Dey, Ping Zhang. IBM Research AI, 1101 Kitchawan Road, Yorktown Heights, NY 10598. kyle.ygao@gmail.com, heng.luo@ibm.com, {achille, aruni, deysa, pzhang}@us.ibm.com

Pseudocode | Yes
    Algorithm 1: Pseudocode of graph CNN.

Open Source Code | No
    The paper does not explicitly state that its source code is open-sourced, nor does it provide a direct link to the implementation. The link in footnote 2 (https://github.com/IBM/InterpretableDTIP) points to the dataset used, not to the model's source code.

Open Datasets | Yes
    BindingDB [Gilson et al., 2016] is a public, web-accessible database for medicinal chemistry, computational chemistry, and systems pharmacology. We took a snapshot of BindingDB that contains 1.3 million data records... By the following criteria we construct a binary classification dataset2 with 39,747 positive examples and 31,218 negative examples. (footnote 2: https://github.com/IBM/InterpretableDTIP)

Dataset Splits | Yes
    We split proteins and drugs into those that should be observed in training and those that should not, under four experimental settings; we then allocate DTI pairs into training, development, and testing datasets. Statistics of the datasets are shown in Table 1. Table 1: The number of distinct proteins, drugs, known positive pairs, and known negative pairs of the training, development, and testing datasets. Train... Dev... Test...

Hardware Specification | No
    The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory specifications.

Software Dependencies | No
    The paper mentions software such as RDKit, LIBMF, Tiresias, and scikit-optimize but does not provide version numbers for these or any other software dependencies.

Experiment Setup | Yes
    During training, the parameters are initialized randomly from a uniform distribution U(-0.08, 0.08). In each step, with batch size equal to 32, a batch of proteins or drugs is randomly selected from the training data. ...we use Adam gradient descent optimization with an initial learning rate of 0.001 to train the parameters. We train the model for 30 epochs, where each epoch consists of 100 steps. ...The hyperparameter values of the best model are shown in Table 2, and the best classification boundary is δ = 0.4995.
    Table 2 (best hyperparameters):
        Protein Sequence Embedding Size: 16; Hidden Dimension: 16; Embedding Dropout: 0.1
        GO Embedding Size: 16; Embedding Dropout: 0.1
        Drug Graph CNN Hidden Dimension: 64
        Siamese Hidden Size: 32; Dropout: 0.1
        γ: 0.0005
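As a rough illustration of the reported training schedule (uniform initialization in [-0.08, 0.08], Adam with learning rate 0.001, batch size 32, 30 epochs of 100 steps each), the sketch below applies those settings to a toy logistic-regression stand-in. The model, data, and variable names are placeholders for illustration only; this is not the paper's architecture or code.

```python
import numpy as np

rng = np.random.default_rng(0)

dim = 16                                      # toy parameter dimension
theta = rng.uniform(-0.08, 0.08, size=dim)    # reported uniform init U(-0.08, 0.08)

# Adam optimizer state and hyperparameters (lr matches the reported 0.001;
# beta/eps values are Adam's common defaults, assumed here).
m, v = np.zeros(dim), np.zeros(dim)
lr, b1, b2, eps = 1e-3, 0.9, 0.999, 1e-8

def adam_step(theta, grad, m, v, t):
    """One bias-corrected Adam update."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

t = 0
for epoch in range(30):            # 30 epochs, as reported
    for step in range(100):        # 100 steps per epoch, as reported
        t += 1
        x = rng.normal(size=(32, dim))               # batch of 32, as reported
        y = (x.sum(axis=1) > 0).astype(float)        # toy binary labels
        p = 1.0 / (1.0 + np.exp(-x @ theta))         # sigmoid predictions
        grad = x.T @ (p - y) / 32                    # mean logistic-loss gradient
        theta, m, v = adam_step(theta, grad, m, v, t)
```

The 30 x 100 schedule yields 3,000 parameter updates in total; on this toy task that is ample for the classifier to separate the two classes.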