PoET: A generative model of protein families as sequences-of-sequences

Authors: Timothy Truong Jr, Tristan Bepler

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In extensive experiments on deep mutational scanning datasets, we show that PoET outperforms existing protein language models and evolutionary sequence models for variant function prediction across proteins of all MSA depths.
Researcher Affiliation | Industry | Timothy F. Truong Jr, OpenProtein.AI, NY, USA, ttruong@openprotein.ai; Tristan Bepler, OpenProtein.AI, NY, USA, tbepler@openprotein.ai
Pseudocode | Yes | Algorithm 1 Tiered Transformer Decoder Layer
Open Source Code | Yes | Code and pre-trained model weights are available at https://github.com/OpenProteinAI/PoET.
Open Datasets | Yes | Models were trained on 29 million sets of homologous sequences. Each set corresponds to a sequence in UniRef50 Version 2103, and contains all its homologs in UniRef50 found using Diamond [34]. We evaluate PoET on ProteinGym [11], the largest collection of such data yet, containing 87 datasets with substitution variants and 7 datasets with indel variants.
Dataset Splits | Yes | We use the same validation set as Notin et al. [11] for tuning hyperparameters.
Hardware Specification | Yes | We trained 57M parameter versions of PoET for up to 3 days on 7 x A100 GPUs with three context lengths: 4K, 8K, and 16K.
Software Dependencies | No | The paper mentions several software tools, including Diamond, JackHMMER, MMseqs2, ColabFold, AlphaFold2, and MAFFT, but gives no version numbers for them or for any other libraries or programming languages used.
Experiment Setup | Yes | We trained 57M parameter versions of PoET for up to 3 days on 7 x A100 GPUs with three context lengths: 4K, 8K, and 16K. We used the Adafactor optimizer [40] with initial learning rate 1e-2, square root learning rate decay, and otherwise default parameters. Hyperparameters for PoET variations used in ablation experiments (Section 5.2.1) are summarized in Table 3.
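
The optimizer settings quoted in the Experiment Setup row can be illustrated with a small schedule sketch. The excerpt states only the initial learning rate (1e-2) and "square root learning rate decay"; the exact formula is not given here. The form lr_t = lr_0 / sqrt(max(t, 1)), the function name `sqrt_decay_lr`, and the example step values below are therefore assumptions for illustration, not the authors' implementation.

```python
import math

INITIAL_LR = 1e-2  # initial learning rate quoted in the Experiment Setup row


def sqrt_decay_lr(step: int, initial_lr: float = INITIAL_LR) -> float:
    """Hypothetical inverse-square-root decay (assumed form, not from the paper).

    Returns initial_lr / sqrt(max(step, 1)), so steps 0 and 1 both use the
    initial learning rate and later steps decay as 1 / sqrt(step).
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    return initial_lr / math.sqrt(max(step, 1))


if __name__ == "__main__":
    # Illustrative step values only; the paper's step counts are not stated here.
    for step in (1, 100, 10_000):
        print(f"step {step:>6}: lr = {sqrt_decay_lr(step):.6f}")
```

Under this assumed form the learning rate falls by a factor of ten for every hundredfold increase in the step count. A real training run would more likely use the schedule built into the optimizer implementation than a hand-written function like this one.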