Generalized Protein Pocket Generation with Prior-Informed Flow Matching
Authors: Zaixi Zhang, Marinka Zitnik, Qi Liu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that PocketFlow outperforms baselines on multiple benchmarks, e.g., achieving an average improvement of 1.29 in Vina Score and 0.05 in scRMSD. Moreover, modeling interactions makes PocketFlow a generalized generative model across multiple ligand modalities, including small molecules, peptides, and RNA. |
| Researcher Affiliation | Academia | 1: School of Computer Science and Technology, University of Science and Technology of China; 2: State Key Laboratory of Cognitive Intelligence, Hefei, Anhui, China; 3: Harvard University |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | The code is provided at https://github.com/zaixizhang/PocketFlow. |
| Open Datasets | Yes | Following previous works [29, 71, 92], we consider two widely used protein-small molecule binding datasets for experimental evaluations: the CrossDocked dataset [27]... the Binding MOAD dataset [34]... To test the generalizability of PocketFlow to other ligand modalities, we further consider PPDBench [3], which contains 133 non-redundant protein-peptide complexes, and PDBBind RNA [80]... |
| Dataset Splits | Yes | The CrossDocked dataset [27] is generated through cross-docking and is split with MMseqs2 [75] at 30% sequence identity, yielding train/val/test sets of 100k/100/100 complexes. The Binding MOAD dataset [34]... resulting in 40k protein-small molecule pairs for training, 100 pairs for validation, and 100 pairs for testing. (A sketch of reproducing this split follows the table.) |
| Hardware Specification | Yes | All the baselines are run on the same Tesla A100 GPU. ... We train on a Tesla A100 GPU for 20 epochs. |
| Software Dependencies | No | The paper mentions tools such as Open Babel and the Adam optimizer but does not specify version numbers. |
| Experiment Setup | Yes | In PocketFlow, the number of network blocks is set to 8, the number of transformer layers within each block is set to 4, and the number of hidden channels used in the IPA calculation is set to 16. The node embedding size D_h and the edge embedding size D_z are set to 128. We removed skip connections and psi-angle prediction. For model training, we use the Adam [45] optimizer with learning rate 0.0001, β1 = 0.9, β2 = 0.999. We train on a Tesla A100 GPU for 20 epochs. In the sampling process, the total number of steps T is set to 50. γ, ξ1, ξ2, and ξ3 are set to 1 in the default setting. (A configuration sketch with these values follows the table.) |
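The 30%-identity split reported in the Dataset Splits row is stated but not scripted in the report. Below is a minimal sketch of how such a split could be reproduced; `mmseqs easy-cluster` and its `--min-seq-id` flag are real MMseqs2 interfaces, while the FASTA path, output prefix, and the cluster-to-split assignment are assumptions rather than the authors' actual pipeline.

```python
# Minimal sketch: cluster sequences at 30% identity with MMseqs2, then
# assign whole clusters to splits so no sequence cluster is shared
# between train, validation, and test.
import subprocess
from collections import defaultdict

# "pockets.fasta" is a hypothetical export of the pocket sequences.
subprocess.run(
    ["mmseqs", "easy-cluster", "pockets.fasta", "clusterRes", "tmp",
     "--min-seq-id", "0.3"],  # 30% sequence identity, as in the paper
    check=True,
)

# easy-cluster writes clusterRes_cluster.tsv with one
# (representative, member) pair per line.
clusters = defaultdict(list)
with open("clusterRes_cluster.tsv") as fh:
    for line in fh:
        rep, member = line.rstrip("\n").split("\t")
        clusters[rep].append(member)

# Keep each cluster entirely inside one split; the paper's exact
# assignment into 100k/100/100 complexes is not given, so this
# allocation is illustrative only.
groups = list(clusters.values())
val, test, train = groups[:100], groups[100:200], groups[200:]
```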
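The Experiment Setup row likewise translates directly into a small configuration. The PyTorch sketch below records the reported hyperparameters; the dataclass, its field names, and the dummy parameter are ours, standing in for the actual PocketFlow network.

```python
# Reported PocketFlow hyperparameters collected into one place.
# Only the numeric values come from the paper; all names are ours.
import torch
from dataclasses import dataclass

@dataclass
class PocketFlowConfig:
    num_blocks: int = 8            # network blocks
    transformer_layers: int = 4    # transformer layers per block
    ipa_hidden_channels: int = 16  # hidden channels in the IPA calculation
    node_embed_dim: int = 128      # D_h
    edge_embed_dim: int = 128      # D_z
    sampling_steps: int = 50       # total number of sampling steps T
    gamma: float = 1.0             # default prior weight γ
    xi: tuple = (1.0, 1.0, 1.0)    # default ξ1, ξ2, ξ3

cfg = PocketFlowConfig()

# Optimizer exactly as reported: Adam, lr = 1e-4, β1 = 0.9, β2 = 0.999,
# trained for 20 epochs on a Tesla A100. A dummy parameter stands in
# for the model's parameters here.
params = [torch.nn.Parameter(torch.zeros(cfg.node_embed_dim))]
optimizer = torch.optim.Adam(params, lr=1e-4, betas=(0.9, 0.999))
```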