DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design
Authors: Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, Quanquan Gu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on CrossDocked2020 show that our approach achieves state-of-the-art performance in generating high-affinity molecules while maintaining proper molecular properties and conformational stability, with up to -8.39 Avg. Vina Dock score and 24.5% Success Rate. ... 4 Experiments; 4.1 Experimental Setup |
| Researcher Affiliation | Collaboration | 1Department of Computer Science, University of Illinois Urbana-Champaign, USA 2ByteDance Research (Work was done during Jiaqi's and Xiangxin's internship at ByteDance) 3School of Artificial Intelligence, University of Chinese Academy of Sciences 4Center for Research on Intelligent Perception and Computing (CRIPAC), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA) 5Institute for AI Industry Research, Tsinghua University, Beijing, China. |
| Pseudocode | Yes | Algorithm 1 Procedure of filtering beta-clusters and determining scaffold and arm priors. |
| Open Source Code | Yes | The code is provided at https://github.com/bytedance/DecompDiff |
| Open Datasets | Yes | Following the previous work (Luo et al., 2021; Peng et al., 2022), we trained our model on the CrossDocked2020 dataset (Francoeur et al., 2020). |
| Dataset Splits | Yes | We use the same dataset preprocessing and splitting procedure as Luo et al. (2021), where the 22.5 million docked binding complexes are first refined to only keep high-quality docking poses (RMSD between the docked pose and the ground truth < 1 Å) and diverse proteins (sequence identity < 30%), and then 100,000 complexes are selected for training and 100 novel proteins are selected as references for testing. (A hedged sketch of this filtering and splitting procedure is given after the table.) |
| Hardware Specification | Yes | We trained our model on one NVIDIA GeForce GTX A100 GPU, and it could converge within 36 hours and 300k steps. |
| Software Dependencies | No | The paper mentions using the 'Adam' optimizer and describes its neural network architecture, but it does not specify software library versions (e.g., Python 3.x, PyTorch 1.x, CUDA x.x). |
| Experiment Setup | Yes | We set the number of diffusion steps as 1000. ... The model is trained via gradient descent method Adam (Kingma & Ba, 2014) with init learning rate=0.001, betas=(0.95, 0.999), batch size=4 and clip gradient norm=8. To balance the scales of different losses, we multiply a factor α = 100 on the atom type loss and bond type loss. ... We also schedule to decay the learning rate exponentially with a factor of 0.6 and a minimum learning rate of 1e-6. (A hedged PyTorch sketch of this configuration is given after the table.) |
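
The dataset-splits row quotes fixed cutoffs (RMSD < 1 Å, sequence identity < 30%, 100,000 training complexes, 100 test proteins). The following minimal Python sketch illustrates that filtering and splitting logic under assumed, hypothetical record fields (`rmsd_to_crystal`, `seq_identity`, `protein_id`); it is not the authors' preprocessing code, which follows Luo et al. (2021).

```python
# Hypothetical sketch of the CrossDocked2020 filtering/splitting criteria quoted above.
# Field and function names are illustrative assumptions, not the authors' pipeline.
from dataclasses import dataclass
from typing import List, Tuple
import random


@dataclass
class Complex:
    protein_id: str
    rmsd_to_crystal: float   # RMSD between the docked pose and the ground-truth pose (Å)
    seq_identity: float      # sequence identity used for the diversity filter (%)


def filter_and_split(complexes: List[Complex],
                     n_train: int = 100_000,
                     n_test_proteins: int = 100,
                     rmsd_cutoff: float = 1.0,
                     identity_cutoff: float = 30.0,
                     seed: int = 0) -> Tuple[List[Complex], List[str]]:
    """Keep high-quality, diverse poses, then pick training complexes and test proteins."""
    rng = random.Random(seed)
    # 1) Quality filter: docked pose within 1 Å RMSD of the ground truth.
    # 2) Diversity filter: sequence identity below 30%.
    kept = [c for c in complexes
            if c.rmsd_to_crystal < rmsd_cutoff and c.seq_identity < identity_cutoff]
    rng.shuffle(kept)
    train = kept[:n_train]
    # Hold out 100 novel protein pockets (by id) as test references.
    test_proteins = sorted({c.protein_id for c in kept})[:n_test_proteins]
    return train, test_proteins
```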
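The experiment-setup row lists concrete optimizer hyperparameters. The minimal PyTorch sketch below wires them together, assuming a placeholder model and externally computed loss terms; the exact loss decomposition and the trigger for the 0.6 learning-rate decay are assumptions, since the quoted text only gives the factor and the 1e-6 floor.

```python
# Minimal sketch of the reported training hyperparameters; `model` and the loss
# terms are placeholders, and the plateau-style decay trigger is an assumption.
import torch

NUM_DIFFUSION_STEPS = 1000
ALPHA = 100.0        # weight on the atom-type and bond-type losses
BATCH_SIZE = 4
CLIP_NORM = 8.0

model = torch.nn.Linear(16, 16)  # stand-in for the actual diffusion model

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.95, 0.999))
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.6, min_lr=1e-6)


def training_step(pos_loss, atom_type_loss, bond_type_loss):
    """One optimization step with the quoted loss weighting and gradient clipping."""
    loss = pos_loss + ALPHA * (atom_type_loss + bond_type_loss)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)
    optimizer.step()
    # After each validation pass one would call scheduler.step(val_loss),
    # which multiplies the learning rate by 0.6 down to the 1e-6 floor.
    return loss.item()
```

This sketch only shows how the quoted numbers (lr = 0.001, betas = (0.95, 0.999), batch size 4, clip norm 8, α = 100, decay factor 0.6, min lr 1e-6) would typically be passed to standard PyTorch components; the released DecompDiff code should be consulted for the actual training loop.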