TANKBind: Trigonometry-Aware Neural NetworKs for Drug-Protein Binding Structure Prediction
Authors: Wei Lu, Qifeng Wu, Jixian Zhang, Jiahua Rao, Chengtao Li, Shuangjia Zheng
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show substantial performance gains in comparison to state-of-the-art physics-based and deep learning-based methods on commonly-used benchmark datasets for both binding structure and affinity predictions with variant settings. We evaluate our algorithm against several state-of-the-art deep learning and physics-based docking methods on the task of binding structure prediction under multiple settings. Table 1: Blind self-docking. All models take a pair of ligand structure (generated by RDKit) and protein structure as input, trying to predict the atom coordinates of the ligand after binding. In blind docking, the protein binding site is assumed unknown. The test set is composed of 363 protein-ligand structures crystallized after 2019, curated by the PDBbind database. |
| Researcher Affiliation | Collaboration | Wei Lu (Galixir Technologies); Qifeng Wu (Fudan University); Jixian Zhang (Galixir Technologies); Jiahua Rao (Sun Yat-sen University); Chengtao Li (Galixir Technologies); Shuangjia Zheng (Galixir Technologies, Sun Yat-sen University). Correspondence to {wei.lu, shuangjia.zheng}@galixir.com |
| Pseudocode | No | The paper describes the model architecture and training process in text and with diagrams (Figure 1 and 2), but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not explicitly state that its source code is open-sourced or provide a link to a code repository for the TANKBind methodology. |
| Open Datasets | Yes | We used publicly available PDBbind v2020 dataset Liu et al. [2015] which has the structures of 19443 protein-ligand complexes along with their experimentally measured binding affinity. PDBbind is a database curated based on the Protein Data Bank (PDB) Burley et al. [2021]. |
| Dataset Splits | Yes | We followed the same time split as defined in the EquiBind paper Stärk et al. [2022], in which the training and validation data are the protein-ligand complex structures deposited before 2019 and the test set is the structures deposited after 2019. After removing a few structures that could not be processed using RDKit Landrum et al. [2013] from the training set, we had 17787 structures for training, 968 for validation and 363 for testing. |
| Hardware Specification | No | The paper states: 'We thank the Guangzhou National Supercomputer Center for providing computational source.' However, it does not specify any particular hardware details such as GPU models, CPU models, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using 'TorchDrug toolkit' and 'RDKit', along with 'Kalign, Biopython, and Smith-Waterman library' but does not specify version numbers for these software dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | Layernorm is applied on every input $z^{(\ell)}_{ij}$ and a 25% dropout is applied to the trigonometry update and self-attention modulation during training. |
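
The time split quoted in the Dataset Splits row can be illustrated with a minimal sketch. It is not the authors' code: the `Complex` record, the `time_split` helper, the toy PDB-style IDs, and the exact cutoff date are all assumptions made for illustration. In particular, "before 2019" is read here as strictly before 1 January 2019, and the way the validation set is drawn from the pre-2019 pool is not described in the quoted text, so it is not modeled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Complex:
    """Hypothetical record for one protein-ligand complex."""
    pdb_id: str      # illustrative identifier, not a real entry
    deposited: date  # deposition date of the structure


# Assumption: "deposited before 2019" means strictly before 2019-01-01.
CUTOFF = date(2019, 1, 1)


def time_split(complexes, cutoff=CUTOFF):
    """Partition complexes by deposition date.

    Returns (train_val_pool, test): structures deposited before the cutoff
    form the training/validation pool, the rest form the test set.
    """
    train_val_pool = [c for c in complexes if c.deposited < cutoff]
    test = [c for c in complexes if c.deposited >= cutoff]
    return train_val_pool, test


# Toy data with made-up IDs and dates.
toy = [
    Complex("aaaa", date(2018, 12, 31)),
    Complex("bbbb", date(2019, 1, 1)),
    Complex("cccc", date(2020, 6, 15)),
]
train_val_pool, test = time_split(toy)
```

A temporal split of this kind keeps every test structure newer than anything seen in training, which is the property the row describes.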