E3Bind: An End-to-End Equivariant Network for Protein-Ligand Docking
Authors: Yangtian Zhang, Huiyu Cai, Chence Shi, Jian Tang
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on standard benchmark datasets demonstrate the superior performance of our end-to-end trainable model compared to traditional and recently-proposed deep learning methods. |
| Researcher Affiliation | Academia | Yangtian Zhang 1,2, Huiyu Cai 1,2, Chence Shi 1,2, Jian Tang 1,3,4; 1 Mila - Québec AI Institute, Canada; 2 Université de Montréal, Canada; 3 HEC Montréal, Canada; 4 CIFAR AI Research Chair |
| Pseudocode | Yes | Algorithm 1: Trioformer Block |
| Open Source Code | No | All code for data preprocessing, training and inference will be publicly released upon acceptance. |
| Open Datasets | Yes | We use the PDBbind v2020 dataset (Liu et al., 2017) for training and evaluation. |
| Dataset Splits | Yes | We follow the time-based dataset split from Stärk et al. (2022), where 363 complex structures uploaded later than 2019 serve as test examples. After removing structures sharing ligands with the test set, the remaining 16,739 structures are used for training and 968 structures are used for validation. (A split sketch appears below the table.) |
| Hardware Specification | No | Table S4 reports inference speed as an 'Avg. Sec.' column for '16-CPU' and 'GPU' settings, but no specific hardware models (e.g., NVIDIA A100, Intel Xeon) or detailed configurations are given for the authors' experiments. |
| Software Dependencies | No | The paper mentions using "TorchDrug (Zhu et al., 2022)" and "RDKit (Landrum et al., 2013)", but it does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The dimensions of protein, ligand and pair embeddings are set to 128. Layer normalization and dropout are applied in each block. For the implementation of EGCL, we use normalized coordinate differences instead of the original coordinate differences. Specifically, all coordinates are divided by 5 before being fed into EGCL and the final coordinates are multiplied by 5. SiLU activation is used in the EGCL layers and LeakyReLU is used in the Trioformer blocks. All models, including the variants in the ablation study, are trained with the Adam optimizer with learning rate 0.0001 for 300 epochs. The model with the best validation score (measured by the fraction of predicted poses with RMSD < 2 Å) is evaluated on the test set. (Sketches of the scaling and training setup appear below the table.) |
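
The time-based split quoted in the Dataset Splits row can be illustrated with a short sketch. This is not the authors' preprocessing code: the record layout (`pdb_id`, `year`, `ligand_id`) and the exact cutoff semantics are assumptions, and the real split comes from the split files of Stärk et al. (2022).

```python
# Minimal sketch of the time-based split described above, assuming each
# PDBbind complex is a dict with hypothetical keys 'pdb_id', 'year'
# (deposition year) and 'ligand_id' (a canonical ligand identifier).

def time_split(complexes, cutoff_year=2019):
    # Structures uploaded later than the cutoff become test examples.
    test = [c for c in complexes if c["year"] > cutoff_year]
    test_ligands = {c["ligand_id"] for c in test}

    # Remove earlier structures that share a ligand with the test set,
    # leaving the pool that is split into training and validation
    # (16,739 + 968 vs. 363 test complexes in the paper).
    remainder = [
        c for c in complexes
        if c["year"] <= cutoff_year and c["ligand_id"] not in test_ligands
    ]
    return remainder, test
```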
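The coordinate scaling described in the Experiment Setup row (divide by 5 before EGCL, multiply the final coordinates by 5) amounts to a constant rescaling around the equivariant layers. A minimal sketch, assuming a stack of EGCL modules with the `(h, x) -> (h, x)` interface of Satorras et al. (2021); the class and attribute names are placeholders, not the authors' module.

```python
import torch.nn as nn

COORD_SCALE = 5.0  # constant quoted in the paper

class ScaledEGCLStack(nn.Module):
    """Hypothetical wrapper: coordinates are divided by 5 before being fed
    into the EGCL layers and the final coordinates are multiplied by 5.
    Scaling by a positive constant commutes with rotations, and input
    translations are scaled consistently on the way in and back out,
    so the wrapped update remains E(3)-equivariant."""

    def __init__(self, egcl_layers):
        super().__init__()
        self.egcl_layers = nn.ModuleList(egcl_layers)

    def forward(self, h, x, edge_index):
        x = x / COORD_SCALE                 # normalize coordinate magnitudes
        for layer in self.egcl_layers:
            h, x = layer(h, x, edge_index)  # assumed (h, x) -> (h, x) interface
        return h, x * COORD_SCALE           # restore the original scale
```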
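The optimizer and model-selection settings in the same row translate into a short training skeleton. A minimal sketch, assuming `model`, `train_loader`, `val_loader`, and an `rmsd` function already exist; the `model.loss` / `model.predict` interface is hypothetical and does not reflect the (unreleased) code.

```python
import torch

# Adam with learning rate 0.0001 for 300 epochs, as stated in the table.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

best_frac = 0.0
for epoch in range(300):
    model.train()
    for batch in train_loader:
        loss = model.loss(batch)        # hypothetical training interface
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Model selection: fraction of predicted poses with RMSD < 2 Å on the
    # validation set; the best checkpoint is evaluated on the test set.
    model.eval()
    hits, total = 0, 0
    with torch.no_grad():
        for batch in val_loader:
            pose = model.predict(batch)                # hypothetical
            hits += int(rmsd(pose, batch.true_pose) < 2.0)
            total += 1
    if hits / total > best_frac:
        best_frac = hits / total
        torch.save(model.state_dict(), "best_checkpoint.pt")
```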