Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
CSGNN: Contrastive Self-Supervised Graph Neural Network for Molecular Interaction Prediction
Authors: Chengshuai Zhao, Shuai Liu, Feng Huang, Shichao Liu, Wen Zhang
IJCAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on seven molecular interaction networks show that CSGNN outperforms classic and state-of-the-art models. Comprehensive experiments indicate that the mix-hop aggregator and the self-supervised regularizer can effectively facilitate the link inference in multifarious molecular networks. |
| Researcher Affiliation | Academia | College of Informatics, Huazhong Agricultural University, Wuhan, China |
| Pseudocode | Yes | Algorithm 1 The Contrastive Self-Supervised Graph Neural Network Algorithm |
| Open Source Code | Yes | A reference implementation of CSGNN may be found at https://github.com/BioMedicalBigDataMiningLab/CSGNN or https://github.com/ChengshuaiZhao0/CSGNN |
| Open Datasets | Yes | We consider seven publicly available network datasets. (1) ChG-Miner consists of 5,018 drugs which target 2,325 proteins via 15,139 drug-target interactions (DTIs). (2) ChCh-Miner contains 48,514 interactions between 1,514 drugs (DDIs). (3) HuRI-PPI includes 23,322 interactions among 5,604 proteins (PPIs) in HI-III network. (4) DG-AssocMiner comprises 519 diseases which associate 7,294 genes through 21,357 associations (DGIs). (5) DD-Miner is a dataset with 6,877 associations between 6,878 diseases (DIAs). (6) DCh-Miner contains 5,536 diseases associate 1,663 drugs via 466,657 associations (DDAs). (7) ChSe-Decagon includes 639 drugs, 10,184 side-effects and 17,499 associations (DSAs) among them. We download DTIs, DDIs, DGIs, DIAs, DDAs, and DSAs from BioSNAP [Marinka Zitnik and Leskovec, 2018], and collect PPIs from CCSB [Luck et al., 2020]. |
| Dataset Splits | Yes | In all experiments, each dataset is split into training, validation, and test sets as the ratio 7:1:2. |
| Hardware Specification | Yes | We run CSGNN and other compared methods on our workstation with 2 Intel(R) Xeon(R) Gold 6146 3.20GHZ CPUs, 128GB RAM, and 2 NVIDIA 1080 Ti GPUs. |
| Software Dependencies | No | The paper mentions employing GCN and GIN as message passing frameworks and provides a link to a GitHub repository. However, it does not specify explicit version numbers for programming languages or software libraries (e.g., Python version, PyTorch version). |
| Experiment Setup | Yes | In this paper, we set mix-hop $K = 2$ and we use the random distribution ranging from 0 to 1 to initiate the feature matrix $X$ on 128 dimensions. We employ MLP with one hidden layer as the scoring function $\rho(\cdot)$. Concretely, the interaction prediction score can be defined as: $\hat{p}_{uv} = \mathrm{MLP}\left(\Vert\,(h_u + h_v,\ h_u \odot h_v,\ h_u,\ h_v)\right)$, where $\odot$ is element-wise product. In the contrastive graph neural network, empirically, we randomly corrupt permutation of initial feature matrix $X$ to $\tilde{X}$ via corruption function $\Pi(\cdot,\cdot)$. Then, we select Mean as the readout function $\Gamma(\cdot)$, which is experimentally efficient for large-scale data. Therefore, the graph-level representation can be denoted by: $s = \mathrm{Mean}(H)$. Further, we instantiate the contrastive discriminator $\Psi(\cdot,\cdot)$ as $\sigma(h^{\top} W s)$. Here, we use sigmoid as the activation function to produce the score that represents probabilities of being a positive sample. Note that the graph neural networks which encode the original and corrupted network share the same parameters. In the joint training, we set $\alpha = 1$, $\beta = 0.1$ and $\gamma = 0.1$. |
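
The Dataset Splits and Experiment Setup rows above can be illustrated with a minimal sketch in plain Python. It is not taken from the CSGNN repositories. The function names (`split_edges`, `pair_features`, `mean_readout`, `discriminator_score`) are hypothetical. The sketch also assumes that the 7:1:2 split is a random per-edge split, that `∥` denotes concatenation, and that the discriminator is the bilinear form σ(hᵀ W s) with s the mean over node embeddings. Negative sampling, the MLP itself, and the joint loss are left out.

```python
import math
import random


def split_edges(edges, ratios=(7, 1, 2), seed=0):
    """Shuffle edges and split them into train/validation/test by ratio.

    Assumption: a uniform random split over edges; the summary only
    states the 7:1:2 ratio, not how the split is drawn.
    """
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # any rounding remainder lands here
    return train, val, test


def pair_features(h_u, h_v):
    """Build the scoring-MLP input: concat(h_u + h_v, h_u * h_v, h_u, h_v)."""
    added = [a + b for a, b in zip(h_u, h_v)]
    product = [a * b for a, b in zip(h_u, h_v)]  # element-wise product
    return added + product + list(h_u) + list(h_v)


def mean_readout(H):
    """Graph-level summary s = Mean(H): column-wise mean of node embeddings."""
    n = len(H)
    return [sum(col) / n for col in zip(*H)]


def discriminator_score(h, W, s):
    """Contrastive discriminator sigma(h^T W s) for one node embedding h."""
    Ws = [sum(w * x for w, x in zip(row, s)) for row in W]
    logit = sum(a * b for a, b in zip(h, Ws))
    return 1.0 / (1.0 + math.exp(-logit))


# Illustrative usage with toy values (not data from the paper).
toy_edges = [(i, i + 1) for i in range(100)]
train, val, test = split_edges(toy_edges)
features = pair_features([1.0, 2.0], [3.0, 4.0])
summary = mean_readout([[1.0, 2.0], [3.0, 4.0]])
score = discriminator_score([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], summary)
```

With embeddings of dimension d, `pair_features` returns a vector of length 4d, which is the input size an MLP scoring function of this form would need.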