Protein Multimer Structure Prediction via Prompt Learning

Authors: Ziqi Gao, Xiangguo Sun, Zijing Liu, Yu Li, Hong Cheng, Jia Li

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirically, we achieve both significant accuracy (RMSD and TM-Score) and efficiency improvements compared to advanced MSP models." The paper details its experiments extensively in Section 5, including datasets, baselines, and performance metrics, confirming its experimental nature.
Researcher Affiliation | Collaboration | Ziqi Gao (1,2), Xiangguo Sun (3), Zijing Liu (4), Yu Li (4), Hong Cheng (3), Jia Li (1,2); (1) Hong Kong University of Science and Technology (Guangzhou), (2) Hong Kong University of Science and Technology, (3) The Chinese University of Hong Kong, (4) IDEA Research, International Digital Economy Academy. The author list mixes university affiliations (HKUST, CUHK) with a research academy (IDEA Research), indicating a collaboration.
Pseudocode | Yes | The paper includes clearly labeled algorithm blocks in Appendix A.1: "Algorithm 1: Formation of E_sou", "Algorithm 2: Calculation of y_sou", and "Algorithm 3: Preparation for D_tar".
Open Source Code | Yes | "The code, data and checkpoints are released at https://github.com/zqgao22/PromptMSP."
Open Datasets | Yes | "We collect all publicly available multimers (3 ≤ N ≤ 30) from the Protein Data Bank (PDB) database (Berman et al., 2000) on 2023-02-20." (A chain-count filter of this kind is sketched after the table.)
Dataset Splits | Yes | Table 5 ("Statistics of PDB-M") explicitly reports Train/Valid/Test counts per chain number N: N=3: 1325/265/10; N=4: 942/188/10; N=5: 981/196/10; N=6-10: 3647/730/50; N=11-15: 267/53/25; N=16-20: 198/40/25; N=21-25: 135/27/25; N=26-30: 66/14/25; Total: 7561/1513/180. (These statistics are also encoded in a sketch after the table.)
Hardware Specification | Yes | "We run all methods on 2 A100 SXM4 40GB GPUs and consider exceeding the memory limit or the resource of 10 GPU hours as failures, which are padded by the upper bound performance of all baselines." (The failure-padding rule is sketched after the table.)
Software Dependencies | No | The paper provides a table of hyperparameters but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, or CUDA versions).
Experiment Setup | Yes | Table 4 ("Hyperparameter choices of PROMPTMSP"): embedding function input dimension 13; GIN layer number K = 2; MLP dimensions in Eq. 8: 1024, 1024; dimensions of ϕ in Eq. 1: 256, 1; dropout rate 0.2; number of attention heads 4; source/target batch sizes 512/512; source/target learning rates 0.01/0.001; task head layer number 2; task head dimensions 256, 1; optimizer Adam. (These values are gathered into a config sketch below.)
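
The Open Datasets row quotes a chain-count constraint of 3 ≤ N ≤ 30 on multimers collected from the PDB. Below is a minimal sketch of such a filter, assuming Biopython for structure parsing; the paper does not name its parsing tooling, so `chain_count` and `keep_multimer` are illustrative names only.

```python
# Hypothetical sketch of a 3 <= N <= 30 chain-count filter for PDB entries.
# Biopython is assumed purely for illustration; the authors' actual pipeline
# is not described at this level of detail.
from Bio.PDB import MMCIFParser

def chain_count(cif_path: str) -> int:
    """Parse an mmCIF file and count its chains."""
    structure = MMCIFParser(QUIET=True).get_structure("s", cif_path)
    return len(list(structure.get_chains()))

def keep_multimer(cif_path: str) -> bool:
    """Keep only multimers whose chain number N satisfies 3 <= N <= 30."""
    return 3 <= chain_count(cif_path) <= 30
```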
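The Table 5 split statistics quoted in the Dataset Splits row can be encoded and sanity-checked directly. This is a minimal bookkeeping sketch; `PDB_M_SPLITS` and `totals` are assumed names, not anything from the authors' code.

```python
# The PDB-M split statistics quoted from Table 5 of the paper.
# Keys are chain-number buckets; values are (train, valid, test) counts.
PDB_M_SPLITS = {
    "3":     (1325, 265, 10),
    "4":     (942,  188, 10),
    "5":     (981,  196, 10),
    "6-10":  (3647, 730, 50),
    "11-15": (267,  53,  25),
    "16-20": (198,  40,  25),
    "21-25": (135,  27,  25),
    "26-30": (66,   14,  25),
}

def totals(splits):
    """Sum (train, valid, test) counts across all chain-number buckets."""
    return tuple(sum(col) for col in zip(*splits.values()))

# Matches the "Total" row of Table 5: 7561 train, 1513 valid, 180 test.
assert totals(PDB_M_SPLITS) == (7561, 1513, 180)
```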
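The Hardware Specification row describes a failure-padding rule: runs that exceed the memory limit or the 10-GPU-hour budget are scored with the upper-bound performance of the baselines. The sketch below is one reading of that rule, assuming "upper bound" means the worst score among successful baselines (the max for a lower-is-better metric such as RMSD); `pad_failures` and the `FAILED` sentinel are hypothetical.

```python
# One possible reading of the failure-padding rule described in the paper:
# a run that exceeds the memory limit or the 10-GPU-hour budget is treated
# as a failure and padded with the upper-bound (worst) baseline score.
FAILED = None  # sentinel for out-of-memory / out-of-time runs

def pad_failures(results, lower_is_better=True):
    """Replace failed entries with the worst score among successful runs.

    `results` maps method name -> score (e.g. RMSD), or FAILED.
    """
    successes = [v for v in results.values() if v is not FAILED]
    bound = max(successes) if lower_is_better else min(successes)
    return {m: (bound if v is FAILED else v) for m, v in results.items()}

# Example: the failed method is scored with the worst observed RMSD (12.4).
print(pad_failures({"baseline_a": 9.7, "baseline_b": 12.4, "baseline_c": FAILED}))
```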
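Finally, the Table 4 hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. This is a minimal sketch assuming a plain dataclass; the field names are illustrative and not taken from the released code.

```python
# Minimal sketch gathering the Table 4 hyperparameters of PROMPTMSP into one
# config object. Values are quoted from the paper; names are assumptions.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class PromptMSPConfig:
    embedding_input_dim: int = 13               # embedding function input dimension
    gin_num_layers: int = 2                     # GIN layer number K
    mlp_dims: Tuple[int, int] = (1024, 1024)    # MLP dimensions in Eq. 8
    phi_dims: Tuple[int, int] = (256, 1)        # dimensions of phi in Eq. 1
    dropout: float = 0.2
    num_attention_heads: int = 4
    source_batch_size: int = 512
    target_batch_size: int = 512
    source_lr: float = 0.01
    target_lr: float = 0.001
    task_head_layers: int = 2
    task_head_dims: Tuple[int, int] = (256, 1)
    optimizer: str = "Adam"

config = PromptMSPConfig()
```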