MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding
Authors: Lirong Wu, Yijun Tian, Yufei Huang, Siyuan Li, Haitao Lin, Nitesh V Chawla, Stan Z. Li
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that MAPE-PPI can scale to PPI prediction with millions of PPIs with superior trade-offs between effectiveness and computational efficiency compared with state-of-the-art competitors. |
| Researcher Affiliation | Academia | Westlake University; Zhejiang University; University of Notre Dame |
| Pseudocode | Yes | The pseudo-code of the proposed MAPE-PPI framework is summarized in Algorithm 1. |
| Open Source Code | Yes | Codes are available at: https://github.com/LirongWu/MAPE-PPI. |
| Open Datasets | Yes | The STRING dataset contains 1,150,830 PPI entries of Homo sapiens from the STRING database (Szklarczyk et al., 2019)... Moreover, we apply AlphaFold2 (Jumper et al., 2021) to predict the 3D structures of all protein sequence data. |
| Dataset Splits | Yes | We split the PPIs into the training (60%), validation (20%), and testing (20%) for all baselines. (A minimal split sketch is given after the table.) |
| Hardware Specification | Yes | The experiments on both baselines and our approach are implemented based on the standard implementation using PyTorch 1.6.0 with an Intel(R) Xeon(R) Gold 6240R @ 2.40GHz CPU and 8 NVIDIA A100 GPUs. |
| Software Dependencies | Yes | The experiments on both baselines and our approach are implemented based on the standard implementation using PyTorch 1.6.0 |
| Experiment Setup | Yes | The following hyperparameters are set the same for all datasets and partitions: PPI encoder (GIN) with layer number Ls = 2 and hidden dimension 1024, learning rate lr = 0.001, weight decay decay = 1e-4, loss weight β = 0.25, pre-training epoch Epre = 50, PPI training epoch E = 500, thresholds ds = 2, dr = 10 Å, and neighbor number K = 5. The other dataset-specific hyperparameters are determined by the AutoML toolkit NNI with the following hyperparameter search spaces: protein encoder with layer number L = {4, 5} and hidden dimension F = {128, 256}, codebook size \|A\| = {256, 512, 1024}, mask ratio \|M\| / \|A\| = {0.1, 0.15, 0.2}, scaling factor γ = {1, 1.5, 2.0}, and loss weight η = {0.5, 1.0}. (A hedged search-space sketch is given after the table.) |
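
As a reading aid for the Experiment Setup row, the sketch below collects the quoted fixed hyperparameters into a plain Python dictionary and writes the dataset-specific search space in NNI's `choice` search-space format. It is only a sketch under these assumptions: all variable names are illustrative and are not the identifiers used in the released code.

```python
# Hedged sketch: fixed hyperparameters quoted above, plus the dataset-specific
# search space expressed in NNI's search-space format ({"_type": "choice", ...}).
# All key names are illustrative, not the identifiers from the MAPE-PPI repo.
fixed_hparams = {
    "ppi_encoder": "GIN",
    "ppi_encoder_layers": 2,    # Ls
    "ppi_hidden_dim": 1024,
    "lr": 1e-3,
    "weight_decay": 1e-4,
    "beta": 0.25,               # loss weight β
    "pretrain_epochs": 50,      # Epre
    "ppi_epochs": 500,          # E
    "d_s": 2,                   # sequential threshold ds
    "d_r": 10.0,                # radius threshold dr in Å
    "num_neighbors": 5,         # K
}

nni_search_space = {
    "protein_encoder_layers": {"_type": "choice", "_value": [4, 5]},            # L
    "protein_hidden_dim":     {"_type": "choice", "_value": [128, 256]},        # F
    "codebook_size":          {"_type": "choice", "_value": [256, 512, 1024]},  # |A|
    "mask_ratio":             {"_type": "choice", "_value": [0.1, 0.15, 0.2]},  # |M| / |A|
    "gamma":                  {"_type": "choice", "_value": [1.0, 1.5, 2.0]},   # scaling factor γ
    "eta":                    {"_type": "choice", "_value": [0.5, 1.0]},        # loss weight η
}
```

In NNI, a dictionary like `nni_search_space` would typically be stored as a `search_space.json` referenced from the experiment configuration, with the trial code reading sampled values via `nni.get_next_parameter()`.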
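
For the Dataset Splits row, the following is a minimal, generic sketch of a 60%/20%/20% split over PPI entries. It only illustrates the stated proportions and is not the partitioning code used in the paper; the `ppi_pairs` input is hypothetical.

```python
import random

def split_ppis(ppi_pairs, seed=0):
    """Generic 60/20/20 split over a list of PPI entries (illustrative only)."""
    indices = list(range(len(ppi_pairs)))
    random.Random(seed).shuffle(indices)
    n_train = int(0.6 * len(indices))
    n_val = int(0.2 * len(indices))
    train = [ppi_pairs[i] for i in indices[:n_train]]
    val = [ppi_pairs[i] for i in indices[n_train:n_train + n_val]]
    test = [ppi_pairs[i] for i in indices[n_train + n_val:]]
    return train, val, test
```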