reproducibilityindex.ai

TANKBind: Trigonometry-Aware Neural NetworKs for Drug-Protein Binding Structure Prediction

Authors: Wei Lu, Qifeng Wu, Jixian Zhang, Jiahua Rao, Chengtao Li, Shuangjia Zheng

NeurIPS 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments show substantial performance gains in comparison to state-of-the-art physics-based and deep learning-based methods on commonly-used benchmark datasets for both binding structure and affinity predictions with variant settings. We evaluate our algorithm against several state-of-the-art deep learning and physics-based docking methods on the task of binding structure prediction under multiple settings. Table 1: Blind self-docking. All models take a pair of ligand structure (generated by RDKit) and protein structure as input, trying to predict the atom coordinates of the ligand after binding. In blind docking, the protein binding site is assumed unknown. The test set is composed of 363 protein-ligand structures crystallized after 2019, curated by the PDBbind database.
Researcher Affiliation	Collaboration	Wei Lu (Galixir Technologies); Qifeng Wu (Fudan University); Jixian Zhang (Galixir Technologies); Jiahua Rao (Sun Yat-sen University); Chengtao Li (Galixir Technologies); Shuangjia Zheng (Galixir Technologies, Sun Yat-sen University). Correspondence to {wei.lu, shuangjia.zheng}@galixir.com
Pseudocode	No	The paper describes the model architecture and training process in text and with diagrams (Figure 1 and 2), but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not explicitly state that its source code is open-sourced or provide a link to a code repository for the TANKBind methodology.
Open Datasets	Yes	We used publicly available PDBbind v2020 dataset Liu et al. [2015] which has the structures of 19443 protein-ligand complexes along with their experimentally measured binding affinity. PDBbind is a database curated based on the Protein Data Bank (PDB) Burley et al. [2021].
Dataset Splits	Yes	We followed the same time split as defined in the EquiBind paper Stärk et al. [2022], in which the training and validation data are the protein-ligand complex structures deposited before 2019 and the test set is the structures deposited after 2019. After removing a few structures that could not be processed using RDKit Landrum et al. [2013] from the training set, we had 17787 structures for training, 968 for validation and 363 for testing.
Hardware Specification	No	The paper states: 'We thank the Guangzhou National Supercomputer Center for providing computational source.' However, it does not specify any particular hardware details such as GPU models, CPU models, or memory specifications used for the experiments.
Software Dependencies	No	The paper mentions using 'TorchDrug toolkit' and 'RDKit', along with 'Kalign, Biopython, and Smith-Waterman library' but does not specify version numbers for these software dependencies, which are required for reproducibility.
Experiment Setup	Yes	Layernorm is applied on every input $z^{(\ell)}_{ij}$ and a 25% dropout is applied to the trigonometry update and self-attention modulation during training.
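
The time split described in the Dataset Splits row can be illustrated with a minimal sketch. It is not the authors' code. The function name `time_split`, its parameters, and the toy IDs are hypothetical. Two details are assumptions, since the quoted text does not state them: complexes deposited in the cutoff year itself are placed in the test set, and validation complexes are drawn as a seeded random sample from the older complexes.

```python
import random
from typing import Dict, List, Tuple


def time_split(
    deposit_year: Dict[str, int],
    cutoff_year: int = 2019,
    n_valid: int = 2,
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Split complex IDs into train/valid/test lists by deposition year."""
    # Assumption: complexes deposited in cutoff_year or later form the test set.
    older = sorted(pid for pid, year in deposit_year.items() if year < cutoff_year)
    test = sorted(pid for pid, year in deposit_year.items() if year >= cutoff_year)

    # Assumption: validation complexes are a seeded random sample of the older ones.
    rng = random.Random(seed)
    valid = sorted(rng.sample(older, min(n_valid, len(older))))
    valid_set = set(valid)
    train = [pid for pid in older if pid not in valid_set]
    return train, valid, test


# Hypothetical IDs and deposition years, for illustration only.
toy_years = {
    "aaaa": 2012,
    "bbbb": 2015,
    "cccc": 2017,
    "dddd": 2018,
    "eeee": 2019,
    "ffff": 2020,
}
train, valid, test = time_split(toy_years, n_valid=1)
```

Splitting by deposition date rather than at random keeps every test complex newer than anything seen during training, which is the property the quoted split relies on.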