Interpretable Drug Target Prediction Using Deep Neural Representation
Authors: Kyle Yingkai Gao, Achille Fokoue, Heng Luo, Arun Iyengar, Sanjoy Dey, Ping Zhang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally compared our model with matrix factorization, similarity-based methods, and a previous deep learning approach. Overall, the results show that our model outperforms other approaches without requiring domain knowledge and feature engineering. |
| Researcher Affiliation | Industry | Kyle Yingkai Gao, Achille Fokoue, Heng Luo, Arun Iyengar, Sanjoy Dey, Ping Zhang IBM Research AI, 1101 Kitchawan Road, Yorktown Heights, NY 10598 kyle.ygao@gmail.com, heng.luo@ibm.com, {achille, aruni, deysa, pzhang}@us.ibm.com |
| Pseudocode | Yes | Algorithm 1: Pseudocode of graph CNN. |
| Open Source Code | No | The paper does not explicitly state that its source code is open-sourced or provide a direct link to the implementation code. The link provided in footnote 2 (https://github.com/IBM/InterpretableDTIP) is for the dataset used, not the model's source code. |
| Open Datasets | Yes | Binding DB [Gilson et al., 2016] is a public, web-accessible database for medicinal chemistry, computational chemistry and systems pharmacology. We took a snapshot of Binding DB that contains 1.3 million data records...By the following criteria we construct a binary classification dataset2 with 39,747 positive examples and 31,218 negative examples. (footnote 2: https://github.com/IBM/InterpretableDTIP) |
| Dataset Splits | Yes | We split proteins and drugs into those that should be observed in training and those that should not with four experimental settings; we then allocate DTI pairs into training, development, and testing datasets. Statistics of the datasets are shown in Table 1. Table 1: The number of distinct proteins, drugs, known positive pairs, and known negative pairs of the training, development, and testing datasets. Train... Dev... Test... |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software like RDKit, LIBMF, Tiresias, and scikit-optimize but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | During training, the parameters are initialized randomly from a uniform distribution U(-0.08, 0.08). In each step, with batch size equals 32, a batch of proteins or drugs is randomly selected from the training data. ...we use Adam gradient descent optimization with initial learning rate equals to 0.001 to train the parameters. We train the model for 30 epochs, where each epoch consists of 100 steps. ...The values of hyperparameters of the best model are shown in Table 2, and the best classification boundary is δ = 0.4995. Table 2: Protein Sequence Embedding Size 16, Hidden Dimension 16, Embedding Dropout 0.1, GO Embedding Size 16, Embedding Dropout 0.1, Drug Graph CNN Hidden Dimension 64, Siamese Hidden Size 32, Dropout 0.1, γ 0.0005. |
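The optimization settings reported above (uniform initialization in [-0.08, 0.08], Adam with initial learning rate 0.001, 30 epochs of 100 steps each) can be sketched in isolation. The toy quadratic objective, the `adam_step` helper, and the random seed below are illustrative assumptions for a minimal, dependency-free demonstration; they are not from the paper, which trains a deep model on mini-batches of size 32.

```python
import random

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update with the standard default betas; lr matches the
    paper's reported initial learning rate of 0.001."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad * grad
    m_hat = m / (1 - b1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)          # bias-corrected second moment
    theta = theta - lr * m_hat / (v_hat ** 0.5 + eps)
    return theta, m, v

# Parameter drawn from U(-0.08, 0.08), as the paper reports for
# initialization (seed chosen here only for reproducibility).
random.seed(0)
theta = random.uniform(-0.08, 0.08)
m = v = 0.0
t = 0

# 30 epochs x 100 steps, mirroring the paper's training schedule, on a
# toy objective f(theta) = (theta - 1)^2 with gradient 2 * (theta - 1).
for epoch in range(30):
    for _ in range(100):
        t += 1
        grad = 2.0 * (theta - 1.0)
        theta, m, v = adam_step(theta, grad, m, v, t)

print(theta)  # should approach the minimizer theta = 1.0
```

With a roughly constant gradient direction, Adam moves about `lr` per step, so 3,000 steps are ample for this one-dimensional toy problem; the paper's schedule of course serves a far higher-dimensional model.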