Protein-Ligand Interaction Prior for Binding-aware 3D Molecule Diffusion Models
Authors: Zhilin Huang, Ling Yang, Xiangxin Zhou, Zhilong Zhang, Wentao Zhang, Xiawu Zheng, Jie Chen, Yu Wang, Bin Cui, Wenming Yang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies on the CrossDocked2020 dataset show IPDIFF can generate molecules with more realistic 3D structures and state-of-the-art binding affinities towards the protein targets, with up to -6.42 Avg. Vina Score, while maintaining proper molecular properties. https://github.com/YangLing0818/IPDiff |
| Researcher Affiliation | Academia | 1 Shenzhen International Graduate School, Tsinghua University; 2 Peng Cheng Laboratory; 3 Peking University; 4 University of Chinese Academy of Sciences; 5 Xiamen University |
| Pseudocode | Yes | Algorithm 1 Training Procedure of IPDIFF |
| Open Source Code | Yes | https://github.com/YangLing0818/IPDiff |
| Open Datasets | Yes | For molecular generation, following the previous work Luo et al. (2021); Peng et al. (2022); Guan et al. (2023a), we train and evaluate IPDIFF on the CrossDocked2020 dataset (Francoeur et al., 2020). |
| Dataset Splits | No | The paper states '100,000 protein-ligand pairs are utilized for training and 100 proteins for testing' for the CrossDocked2020 dataset but does not explicitly provide details for a validation split. |
| Hardware Specification | Yes | We train IPNet on a single NVIDIA V100 GPU, and we use Adam as our optimizer with learning rate 0.001, betas = (0.95, 0.999), batch size 16. |
| Software Dependencies | No | The paper mentions 'Adam as our optimizer' and 'AutoDock Vina', but does not provide specific version numbers for software dependencies or libraries used for implementation. |
| Experiment Setup | Yes | Following Guan et al. (2023a), we use Adam as our optimizer with learning rate 0.001, betas = (0.95, 0.999), batch size 4 and clipped gradient norm 8. We balance the atom type loss and atom position loss by multiplying a scaling factor λ = 100 on the atom type loss. During the training phase, we add small Gaussian noise with a standard deviation of 0.1 to protein atom coordinates as data augmentation. A hedged sketch of this training configuration follows the table. |
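
The hyperparameters reported in the Experiment Setup row map directly onto a standard PyTorch training step. The sketch below is a minimal illustration under those reported settings only (Adam with lr 0.001 and betas (0.95, 0.999), batch size 4, gradient-norm clipping at 8, λ = 100 on the atom-type loss, and Gaussian noise with std 0.1 on protein coordinates as augmentation); the `ToyDenoiser` model, the random toy batches, and the loss-term names are placeholders invented for illustration and are not the authors' IPDiff implementation (see their repository for that).

```python
# Hedged sketch of the reported IPDiff training configuration.
# Only the hyperparameters come from the paper; the model and data are toy stand-ins.
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    """Placeholder for the actual protein-conditioned denoising network."""
    def __init__(self, num_atom_types=8):
        super().__init__()
        self.pos_head = nn.Linear(3, 3)               # predicts position noise
        self.type_head = nn.Linear(3, num_atom_types)  # predicts atom-type logits

    def forward(self, ligand_pos, protein_pos):
        # Crude protein conditioning: broadcast the mean-pooled protein context.
        context = protein_pos.mean(dim=1, keepdim=True)
        h = ligand_pos + context
        return self.pos_head(h), self.type_head(h)

model = ToyDenoiser()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.95, 0.999))

LAMBDA_TYPE = 100.0       # scaling factor on the atom-type loss
CLIP_NORM = 8.0           # clipped gradient norm
PROTEIN_NOISE_STD = 0.1   # augmentation noise on protein atom coordinates
BATCH_SIZE = 4

for step in range(10):
    # Toy batch: random protein/ligand coordinates and atom types.
    protein_pos = torch.randn(BATCH_SIZE, 30, 3)
    ligand_pos = torch.randn(BATCH_SIZE, 20, 3)
    ligand_type = torch.randint(0, 8, (BATCH_SIZE, 20))

    # Data augmentation: small Gaussian noise on protein atom coordinates.
    protein_pos = protein_pos + PROTEIN_NOISE_STD * torch.randn_like(protein_pos)

    pred_pos_noise, type_logits = model(ligand_pos, protein_pos)
    # Toy targets: random noise stands in for the true diffusion noise target.
    loss_pos = ((pred_pos_noise - torch.randn_like(ligand_pos)) ** 2).mean()
    loss_type = nn.functional.cross_entropy(
        type_logits.reshape(-1, 8), ligand_type.reshape(-1))
    loss = loss_pos + LAMBDA_TYPE * loss_type   # lambda = 100 on the atom-type loss

    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)
    optimizer.step()
```

The same optimizer settings (lr 0.001, betas (0.95, 0.999)) are reported for training IPNet in the Hardware Specification row, differing only in the batch size of 16.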