Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration

Authors: Haitao Lin, Yufei Huang, Odin Zhang, Yunfan Liu, Lirong Wu, Siyuan Li, Zhiyuan Chen, Stan Z. Li

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental In the experiments, our method can generate molecules with more realistic 3D structures, competitive affinities toward the protein targets, and better drug properties.
Researcher Affiliation Collaboration Haitao Lin (Westlake University, linhaitao@westlake.edu.cn); Yufei Huang (Westlake University, huangyufei@westlake.edu.cn); Odin Zhang (Zhejiang University, haotianzhang@zju.edu.cn); Lirong Wu (Westlake University, wulirong@westlake.edu.cn); Siyuan Li (Westlake University, lisiyuan@westlake.edu.cn); Zhiyuan Chen (Deep Potential, chenzhiyuan@dp.tech); Stan Z. Li (Westlake University, stan.zq.li@westlake.edu.cn)
Pseudocode Yes Algorithm 1 Joint Generation for Molecules using D3FG
Open Source Code No The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets Yes In the experiments, we use CrossDocked2020 [32] for evaluation.
Dataset Splits Yes The datasets for training and evaluation are split according to POCKET2MOL [9] and TARGETDIFF [13]. 22.5 million docked protein binding complexes with low RMSD (< 1 Å) and sequence identity less than 30% are selected, leading to 100,000 pairs of pocket-ligand complexes, with 100 novel complexes as references for evaluation.
Hardware Specification Yes We use a single NVIDIA A100 (81920 MiB) GPU for a trial.
Software Dependencies Yes The codes are implemented in Python 3.9 mainly with PyTorch 1.12.
Experiment Setup Yes In the diffusion of orientation and position, we employ a cosine variance schedule for $\alpha_t$, which reads $\alpha_t = \cos^2\left(\frac{\pi}{2} \cdot \frac{t/T + s}{1 + s}\right) / \cos^2\left(\frac{\pi}{2} \cdot \frac{s}{1 + s}\right)$, where $s = 0.01$. In the diffusion of atom type, $\beta_t$ is set as $\beta_t = t/T$. For the denoiser, the layer number is set as 6, and the embedding size is set as 256. The model is trained with Adam optimizer in 5000 epochs.
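The two schedules quoted in the Experiment Setup row can be evaluated numerically. The sketch below is a minimal illustration in plain Python, not the authors' code: the function names `cosine_alpha` and `linear_beta` are hypothetical, and the number of diffusion steps `T = 1000` is an assumption, since the excerpt does not state it. Only the formulas and `s = 0.01` come from the quoted text.

```python
import math


def cosine_alpha(t: int, T: int, s: float = 0.01) -> float:
    """Cosine schedule as quoted:
    alpha_t = cos^2(pi/2 * (t/T + s)/(1 + s)) / cos^2(pi/2 * s/(1 + s)).
    """
    numerator = math.cos(math.pi / 2 * (t / T + s) / (1 + s)) ** 2
    denominator = math.cos(math.pi / 2 * s / (1 + s)) ** 2
    return numerator / denominator


def linear_beta(t: int, T: int) -> float:
    """Linear schedule as quoted for atom-type diffusion: beta_t = t/T."""
    return t / T


# Assumption: T is not given in the excerpt; 1000 is only a placeholder.
T = 1000
alphas = [cosine_alpha(t, T) for t in range(T + 1)]
betas = [linear_beta(t, T) for t in range(T + 1)]

# Print a few points along each schedule.
for t in (0, T // 4, T // 2, 3 * T // 4, T):
    print(f"t={t:4d}  alpha_t={alphas[t]:.6f}  beta_t={betas[t]:.3f}")
```

Under these assumptions the normalization by the denominator makes the cosine schedule start at 1 for t = 0, and it decays to approximately 0 at t = T, while the linear schedule rises from 0 to 1.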