CSGNN: Contrastive Self-Supervised Graph Neural Network for Molecular Interaction Prediction
Authors: Chengshuai Zhao, Shuai Liu, Feng Huang, Shichao Liu, Wen Zhang
IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on seven molecular interaction networks show that CSGNN outperforms classic and state-of-the-art models. Comprehensive experiments indicate that the mix-hop aggregator and the self-supervised regularizer can effectively facilitate link inference in multifarious molecular networks. |
| Researcher Affiliation | Academia | College of Informatics, Huazhong Agricultural University, Wuhan, China {zhaochengshuai, shuailiu, fhuang233}@webmail.hzau.edu.cn, {scliu, zhangwen}@mail.hzau.edu.cn |
| Pseudocode | Yes | Algorithm 1 The Contrastive Self-Supervised Graph Neural Network Algorithm |
| Open Source Code | Yes | A reference implementation of CSGNN may be found at https://github.com/BioMedicalBigDataMiningLab/CSGNN or https://github.com/ChengshuaiZhao0/CSGNN |
| Open Datasets | Yes | We consider seven publicly available network datasets. (1) ChG-Miner consists of 5,018 drugs which target 2,325 proteins via 15,139 drug-target interactions (DTIs). (2) ChCh-Miner contains 48,514 interactions (DDIs) between 1,514 drugs. (3) HuRI-PPI includes 23,322 interactions (PPIs) among 5,604 proteins in the HI-III network. (4) DG-AssocMiner comprises 519 diseases associated with 7,294 genes through 21,357 associations (DGIs). (5) DD-Miner is a dataset with 6,877 associations (DIAs) between 6,878 diseases. (6) DCh-Miner contains 5,536 diseases associated with 1,663 drugs via 466,657 associations (DDAs). (7) ChSe-Decagon includes 639 drugs, 10,184 side effects, and 17,499 associations (DSAs) among them. We download DTIs, DDIs, DGIs, DIAs, DDAs, and DSAs from BioSNAP [Marinka Zitnik and Leskovec, 2018], and collect PPIs from CCSB [Luck et al., 2020]. |
| Dataset Splits | Yes | In all experiments, each dataset is split into training, validation, and test sets at a ratio of 7:1:2 (see the split sketch after the table). |
| Hardware Specification | Yes | We run CSGNN and other compared methods on our workstation with 2 Intel(R) Xeon(R) Gold 6146 3.20GHz CPUs, 128GB RAM, and 2 NVIDIA 1080 Ti GPUs. |
| Software Dependencies | No | The paper mentions employing GCN and GIN as message passing frameworks and provides a link to a GitHub repository. However, it does not specify explicit version numbers for programming languages or software libraries (e.g., Python version, PyTorch version). |
| Experiment Setup | Yes | In this paper, we set mix-hop $K = 2$ and initialize the 128-dimensional feature matrix $X$ with random values in $[0, 1]$. We employ an MLP with one hidden layer as the scoring function $\rho(\cdot)$. Concretely, the interaction prediction score is defined as $\hat{p}_{uv} = \mathrm{MLP}(\Vert(h_u + h_v, h_u \odot h_v, h_u, h_v))$, where $\odot$ is the element-wise product and $\Vert$ denotes concatenation. In the contrastive graph neural network, empirically, we randomly permute the initial feature matrix $X$ to obtain the corrupted $\tilde{X}$ via the corruption function $\Pi(\cdot, \cdot)$. We then select Mean as the readout function $\Gamma(\cdot)$, which is experimentally efficient for large-scale data, so the graph-level representation is $s = \mathrm{Mean}(H)$. Further, we instantiate the contrastive discriminator $\Psi(\cdot, \cdot)$ as $\sigma(h^T W s)$, using sigmoid as the activation function to produce a score that represents the probability of being a positive sample. Note that the graph neural networks encoding the original and corrupted networks share the same parameters. In the joint training, we set $\alpha = 1$, $\beta = 0.1$, and $\gamma = 0.1$. A hedged sketch of these components appears after the table. |
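
The 7:1:2 split quoted above is straightforward to reproduce. Below is a minimal sketch assuming a NumPy edge list; `split_edges` and its arguments are illustrative names, not taken from the CSGNN repository.

```python
# Minimal sketch of the 7:1:2 edge split described in the paper.
import numpy as np

def split_edges(edges, seed=0):
    """Shuffle interaction edges and split them 70/10/20 into
    training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    edges = np.asarray(edges)
    perm = rng.permutation(len(edges))
    n_train = int(0.7 * len(edges))
    n_val = int(0.1 * len(edges))
    train = edges[perm[:n_train]]
    val = edges[perm[n_train:n_train + n_val]]
    test = edges[perm[n_train + n_val:]]
    return train, val, test

# Example: 10 dummy (u, v) interaction pairs.
train, val, test = split_edges([(i, i + 1) for i in range(10)])
print(len(train), len(val), len(test))  # 7 1 2
```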
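
The setup row names four concrete components: the one-hidden-layer scoring MLP, the corruption function $\Pi$, the Mean readout $\Gamma$, and the bilinear discriminator $\Psi(h, s) = \sigma(h^T W s)$. The following is a minimal PyTorch sketch of those pieces under the stated $d = 128$ embedding size; the class names, hidden width, and the sigmoid on the MLP output are assumptions rather than the authors' implementation.

```python
# Hedged sketch of the scoring head, corruption, readout, and discriminator
# described in the Experiment Setup row. Names and hidden size are assumptions.
import torch
import torch.nn as nn

class ScoreMLP(nn.Module):
    """Scoring function rho: p_uv = MLP(||(h_u + h_v, h_u ⊙ h_v, h_u, h_v))."""
    def __init__(self, d=128, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(4 * d, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, h_u, h_v):
        # || is concatenation; ⊙ (here *) is the element-wise product.
        z = torch.cat([h_u + h_v, h_u * h_v, h_u, h_v], dim=-1)
        return torch.sigmoid(self.mlp(z)).squeeze(-1)  # sigmoid is an assumption

class Discriminator(nn.Module):
    """Contrastive discriminator Psi(h, s) = sigma(h^T W s)."""
    def __init__(self, d=128):
        super().__init__()
        self.W = nn.Parameter(torch.empty(d, d))
        nn.init.xavier_uniform_(self.W)

    def forward(self, H, s):
        # H: (n, d) node embeddings; s: (d,) graph summary; output in (0, 1).
        return torch.sigmoid(H @ self.W @ s)

n, d = 5, 128
X = torch.rand(n, d)            # feature matrix X, random values in [0, 1)
X_tilde = X[torch.randperm(n)]  # corruption Pi: permute rows of X
H = torch.randn(n, d)           # stand-in for GNN-encoded node embeddings
s = H.mean(dim=0)               # Mean readout: s = Mean(H)
print(Discriminator()(H, s))    # probabilities of being positive samples
print(ScoreMLP()(H[0], H[1]))   # interaction score for node pair (0, 1)
```

The shared-parameter constraint quoted in the table would correspond to encoding both $X$ and $\tilde{X}$ with the same GNN instance; the encoder itself is omitted from this sketch.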