ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
Authors: Minghao Xu, Xinyu Yuan, Santiago Miret, Jian Tang
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify the superiority of ProtST-induced PLMs over previous ones on diverse representation learning benchmarks. ... We investigate the PLMs trained under ProtST by representation learning and zero-shot prediction. For representation learning, we verify their superior performance over previous masked language modeling and knowledge-enhanced PLMs on 11 standard benchmarks for protein localization prediction, fitness landscape prediction and protein function annotation (Sec. 4.2). |
| Researcher Affiliation | Collaboration | ¹Mila – Québec AI Institute ²Université de Montréal ³Intel Labs ⁴HEC Montréal ⁵CIFAR AI Research Chair. Correspondence to: Minghao Xu <minghao.xu@mila.quebec>, Santiago Miret <santiago.miret@intel.com>, Jian Tang <jian.tang@hec.ca>. |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. Methods are described in prose and diagrams. |
| Open Source Code | Yes | Source code and model weights are available at https://github.com/DeepGraphLearning/ProtST. |
| Open Datasets | Yes | To inject protein property information into PLMs, we build the ProtDescribe dataset with 553,052 aligned pairs of protein sequence and property description. Specifically, we employ the Swiss-Prot (Bairoch & Apweiler, 2000) database to provide annotations of various protein properties... |
| Dataset Splits | Yes | For all models on all tasks, we select the checkpoint for evaluation based on the validation set performance, and all results are reported on the seed 0. |
| Hardware Specification | Yes | An Adam optimizer (Kingma & Ba, 2014) (learning rate: 1.0 × 10⁻⁵, weight decay: 0) is used to train the whole model for 20 epochs on 4 Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions using the "Adam optimizer" and "TorchDrug" but does not provide specific version numbers for these software components or other libraries used in the implementation. |
| Experiment Setup | Yes | An Adam optimizer (Kingma & Ba, 2014) (learning rate: 1.0 × 10⁻⁵, weight decay: 0) is used to train the whole model for 20 epochs on 4 Tesla V100 GPUs. ... ProtST-ProtBert adopts the batch size of 16 (4 proteins per GPU), and ProtST-ESM-1b and ProtST-ESM-2 adopt the batch size of 12 (3 proteins per GPU). ... We truncate the protein sequences that have more than 450 residues to the length of 450, where the truncation starts from a random residue before the last 450 ones. ... we initialize the temperature parameter τ in Eq. (1) as 0.07 and optimize it along the training process. (Illustrative sketches of the truncation and temperature/optimizer setup appear after this table.) |
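The random truncation quoted in the Experiment Setup row can be written as a short preprocessing helper. The sketch below is a minimal assumption-based illustration: the function name `truncate_sequence` and the uniform sampling of the crop start are hypothetical choices, not code from the ProtST repository, and the input may be any sequence type (string or list of residues).

```python
import random

def truncate_sequence(residues, max_length=450):
    """Randomly crop a protein sequence to at most `max_length` residues.

    The crop start is sampled uniformly over positions that still leave
    `max_length` residues available, matching the paper's description of
    truncation starting from a random residue before the last 450 ones.
    """
    if len(residues) <= max_length:
        return residues
    start = random.randint(0, len(residues) - max_length)
    return residues[start:start + max_length]
```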
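The learnable temperature τ (initialized to 0.07) and the reported Adam settings (learning rate 1.0 × 10⁻⁵, weight decay 0) can be sketched in PyTorch as follows. The `ContrastiveHead` class, the log-parameterization of τ, and the symmetric cross-entropy loss are assumptions about a typical CLIP-style contrastive setup, not the authors' exact implementation; the embedding dimension and batch size in the usage example are arbitrary, and a full run would optimize the protein and text encoders jointly with this head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveHead(nn.Module):
    """Hypothetical InfoNCE-style head with a learnable temperature tau."""

    def __init__(self, init_tau=0.07):
        super().__init__()
        # Log-parameterization keeps tau positive while it is optimized
        # jointly with the model (an implementation choice, not from the paper).
        self.log_tau = nn.Parameter(torch.log(torch.tensor(init_tau)))

    def forward(self, protein_emb, text_emb):
        # Cosine-similarity logits scaled by the learnable temperature.
        protein_emb = F.normalize(protein_emb, dim=-1)
        text_emb = F.normalize(text_emb, dim=-1)
        logits = protein_emb @ text_emb.t() / self.log_tau.exp()
        targets = torch.arange(logits.size(0), device=logits.device)
        # Symmetric cross-entropy over protein-to-text and text-to-protein.
        return 0.5 * (F.cross_entropy(logits, targets)
                      + F.cross_entropy(logits.t(), targets))

if __name__ == "__main__":
    head = ContrastiveHead()
    # Dummy batch of 4 aligned (protein, text) embedding pairs, dimension 512.
    loss = head(torch.randn(4, 512), torch.randn(4, 512))
    # Adam with the hyperparameters reported in the paper.
    optimizer = torch.optim.Adam(head.parameters(), lr=1e-5, weight_decay=0)
    loss.backward()
    optimizer.step()
```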