DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design
Authors: Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, Quanquan Gu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on CrossDocked2020 show that our approach achieves state-of-the-art performance in generating high-affinity molecules while maintaining proper molecular properties and conformational stability, with up to -8.39 Avg. Vina Dock score and 24.5% Success Rate. ... 4 Experiments; 4.1 Experimental Setup |
| Researcher Affiliation | Collaboration | 1Department of Computer Science, University of Illinois Urbana-Champaign, USA 2ByteDance Research (Work was done during Jiaqi's and Xiangxin's internship at ByteDance) 3School of Artificial Intelligence, University of Chinese Academy of Sciences 4Center for Research on Intelligent Perception and Computing (CRIPAC), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA) 5Institute for AI Industry Research, Tsinghua University, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 Procedure of filtering beta-clusters and determining scaffold and arm priors. |
| Open Source Code | Yes | The code is provided at https://github.com/bytedance/DecompDiff |
| Open Datasets | Yes | Following the previous work (Luo et al., 2021; Peng et al., 2022), we trained our model on the CrossDocked2020 dataset (Francoeur et al., 2020). |
| Dataset Splits | Yes | We use the same dataset preprocessing and splitting procedure as Luo et al. (2021), where the 22.5 million docked binding complexes are first refined to only keep high-quality docking poses (RMSD between the docked pose and the ground truth < 1 Å) and diverse proteins (sequence identity < 30%), and then 100,000 complexes are selected for training and 100 novel proteins are selected as references for testing. (A hedged sketch of this filtering and splitting procedure is given after the table.) |
| Hardware Specification | Yes | We trained our model on one NVIDIA GeForce GTX A100 GPU, and it could converge within 36 hours and 300k steps. |
| Software Dependencies | No | The paper mentions using the 'Adam' optimizer and describes its neural network architecture, but it does not specify software library versions (e.g., Python 3.x, PyTorch 1.x, CUDA x.x). |
| Experiment Setup | Yes | We set the number of diffusion steps as 1000. ... The model is trained via gradient descent method Adam (Kingma & Ba, 2014) with init learning rate=0.001, betas=(0.95, 0.999), batch size=4 and clip gradient norm=8. To balance the scales of different losses, we multiply a factor α = 100 on the atom type loss and bond type loss. ... We also schedule to decay the learning rate exponentially with a factor of 0.6 and a minimum learning rate of 1e-6. (A hedged PyTorch sketch of this configuration is given after the table.) |
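
The dataset-splits row quotes fixed cutoffs (RMSD < 1 Å, sequence identity < 30%, 100,000 training complexes, 100 test proteins). The following minimal Python sketch illustrates that filtering and splitting logic under assumed, hypothetical record fields (`rmsd_to_crystal`, `seq_identity`, `protein_id`); it is not the authors' preprocessing code, which follows Luo et al. (2021).

```python
# Hypothetical sketch of the CrossDocked2020 filtering/splitting criteria quoted above.
# Field and function names are illustrative assumptions, not the authors' pipeline.
from dataclasses import dataclass
from typing import List, Tuple
import random


@dataclass
class Complex:
    protein_id: str
    rmsd_to_crystal: float   # RMSD between the docked pose and the ground-truth pose (Å)
    seq_identity: float      # sequence identity used for the diversity filter (%)


def filter_and_split(complexes: List[Complex],
                     n_train: int = 100_000,
                     n_test_proteins: int = 100,
                     rmsd_cutoff: float = 1.0,
                     identity_cutoff: float = 30.0,
                     seed: int = 0) -> Tuple[List[Complex], List[str]]:
    """Keep high-quality, diverse poses, then pick training complexes and test proteins."""
    rng = random.Random(seed)
    # 1) Quality filter: docked pose within 1 Å RMSD of the ground truth.
    # 2) Diversity filter: sequence identity below 30%.
    kept = [c for c in complexes
            if c.rmsd_to_crystal < rmsd_cutoff and c.seq_identity < identity_cutoff]
    rng.shuffle(kept)
    train = kept[:n_train]
    # Hold out 100 novel protein pockets (by id) as test references.
    test_proteins = sorted({c.protein_id for c in kept})[:n_test_proteins]
    return train, test_proteins
```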
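The experiment-setup row lists concrete optimizer hyperparameters. The minimal PyTorch sketch below wires them together, assuming a placeholder model and externally computed loss terms; the exact loss decomposition and the trigger for the 0.6 learning-rate decay are assumptions, since the quoted text only gives the factor and the 1e-6 floor.

```python
# Minimal sketch of the reported training hyperparameters; `model` and the loss
# terms are placeholders, and the plateau-style decay trigger is an assumption.
import torch

NUM_DIFFUSION_STEPS = 1000
ALPHA = 100.0        # weight on the atom-type and bond-type losses
BATCH_SIZE = 4
CLIP_NORM = 8.0

model = torch.nn.Linear(16, 16)  # stand-in for the actual diffusion model

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.95, 0.999))
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.6, min_lr=1e-6)


def training_step(pos_loss, atom_type_loss, bond_type_loss):
    """One optimization step with the quoted loss weighting and gradient clipping."""
    loss = pos_loss + ALPHA * (atom_type_loss + bond_type_loss)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)
    optimizer.step()
    # After each validation pass one would call scheduler.step(val_loss),
    # which multiplies the learning rate by 0.6 down to the 1e-6 floor.
    return loss.item()
```

This sketch only shows how the quoted numbers (lr = 0.001, betas = (0.95, 0.999), batch size 4, clip norm 8, α = 100, decay factor 0.6, min lr 1e-6) would typically be passed to standard PyTorch components; the released DecompDiff code should be consulted for the actual training loop.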