ProtGO: Function-Guided Protein Modeling for Unified Representation Learning
Authors: Bozhen Hu, Cheng Tan, Yongjie Xu, Zhangyang Gao, Jun Xia, Lirong Wu, Stan Z. Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Benchmark experiments highlight that ProtGO significantly outperforms state-of-the-art baselines, clearly demonstrating the advantages of the proposed unified framework. |
| Researcher Affiliation | Academia | 1Zhejiang University 2Westlake University {hubozhen, tancheng, stan.zq.li}@westlake.edu.cn |
| Pseudocode | No | The paper describes the message passing mechanism using mathematical formulas and descriptions (Eq. 4) but does not provide a separate, clearly labeled 'Pseudocode' or 'Algorithm' block. (A generic message-passing sketch is given after the table.) |
| Open Source Code | Yes | Answer: [Yes] Justification: Codes are in the supplementary material. (from NeurIPS Paper Checklist) |
| Open Datasets | Yes | A dataset comprising approximately 30,000 proteins, each associated with 2,752 GO annotations from the GO dataset, is utilized without further categorization into biological process (BP), molecular function (MF), and cellular component (CC) classes [67]. |
| Dataset Splits | Yes | Table 4: Dataset statistics (#X means the number of X): Enzyme Commission — 15,550 train / 1,729 validation / 1,919 test; Gene Ontology — 29,898 train / 3,322 validation / 3,415 test. |
| Hardware Specification | Yes | The proposed framework conducted experiments on NVIDIA A100 GPUs and NVIDIA Tesla V100 GPUs, implemented with PyTorch 1.13+cu117 and PyTorch Geometric 2.3.1 with CUDA 11.2. |
| Software Dependencies | Yes | The proposed framework conducted experiments on NVIDIA A100 GPUs and NVIDIA Tesla V100 GPUs, implemented with PyTorch 1.13+cu117 and PyTorch Geometric 2.3.1 with CUDA 11.2. |
| Experiment Setup | Yes | The optimization is performed using the Adam optimizer through the PyTorch library, and the performance metrics are computed as mean values over three initializations. Further details regarding experimental settings are available in Appendix E.2. (A minimal sketch of this protocol follows the table.) |
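
As noted in the Pseudocode row, the paper presents its message passing only as equations (Eq. 4) rather than as an algorithm block. For orientation, the sketch below shows how a generic message-passing layer is typically written with PyTorch Geometric, the library the authors report using; the linear transformations and sum aggregation here are illustrative placeholders, not a reconstruction of the paper's Eq. 4.

```python
import torch
from torch import nn
from torch_geometric.nn import MessagePassing


class SimpleMPLayer(MessagePassing):
    """Generic message-passing layer (NOT the paper's Eq. 4): neighbor
    features are linearly transformed, sum-aggregated, and combined with
    the central node's own transformed features."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__(aggr="add")            # sum aggregation over neighbors
        self.lin_neigh = nn.Linear(in_dim, out_dim)
        self.lin_self = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        # x: [num_nodes, in_dim]; edge_index: [2, num_edges]
        aggr = self.propagate(edge_index, x=x)  # triggers message() + aggregation
        return torch.relu(self.lin_self(x) + aggr)

    def message(self, x_j: torch.Tensor) -> torch.Tensor:
        # x_j: features of the source node of each edge
        return self.lin_neigh(x_j)
```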
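The Experiment Setup row specifies only the optimizer (Adam, via PyTorch) and that metrics are averaged over three initializations, deferring the remaining hyperparameters to Appendix E.2. Below is a minimal sketch of that averaging protocol under stated assumptions: `build_model`, `evaluate`, the data loaders, the epoch count, and the learning rate are hypothetical placeholders, not values taken from the paper.

```python
import torch
from statistics import mean


def train_and_eval(build_model, train_loader, test_loader, evaluate,
                   seed: int, epochs: int = 100, lr: float = 1e-4) -> float:
    """Train one model from a fresh random initialization with Adam and
    return a test metric. `build_model` and `evaluate` are caller-supplied
    (hypothetical) hooks; only the use of Adam and retraining per seed
    come from the paper's reported setup."""
    torch.manual_seed(seed)                     # fresh initialization per run
    model = build_model()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # Adam, as reported
    for _ in range(epochs):
        for batch in train_loader:
            optimizer.zero_grad()
            loss = model(batch)                 # hypothetical: model returns its loss
            loss.backward()
            optimizer.step()
    return evaluate(model, test_loader)


def mean_over_seeds(build_model, train_loader, test_loader, evaluate) -> float:
    """Report the mean metric over three random initializations, as described."""
    scores = [train_and_eval(build_model, train_loader, test_loader, evaluate, seed)
              for seed in (0, 1, 2)]
    return mean(scores)
```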