De Novo Molecular Generation via Connection-aware Motif Mining

Authors: Zijie Geng, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Jie Wang, Yongdong Zhang, Feng Wu, Tie-Yan Liu

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our method on distribution-learning benchmarks (i.e., generating novel molecules to resemble the distribution of a given training set) and goal-directed benchmarks (i.e., generating molecules with target properties), and achieve significant improvements over previous fragment-based baselines.
Researcher Affiliation | Collaboration | Zijie Geng1, Shufang Xie2, Yingce Xia3, Lijun Wu3, Tao Qin3, Jie Wang1,4, Yongdong Zhang1, Feng Wu1, Tie-Yan Liu3. 1 University of Science and Technology of China, ustcgzj@mail.ustc.edu.cn, {jiewangx, zhyd73, fengwu}@ustc.edu.cn; 2 Gaoling School of Artificial Intelligence, Renmin University of China, shufangxie@ruc.edu.cn; 3 Microsoft Research AI4Science, {yingce.xia, lijunwu, taoqin, tyliu}@microsoft.com; 4 Institute of Artificial Intelligence, Hefei Comprehensive National Science Center
Pseudocode | Yes | Algorithm 1: Connection-aware Motif Mining; Algorithm 2: Generating a molecule
Open Source Code | Yes | The code of MiCaM is available at https://github.com/MIRALab-USTC/AI4Sci-MiCaM.
Open Datasets | Yes | We evaluate our method on three datasets: QM9 (Ruddigkeit et al., 2012), ZINC (Irwin et al., 2012), and GuacaMol (a post-processed ChEMBL (Mendez et al., 2019) dataset proposed by Brown et al. (2019)).
Dataset Splits | No | βprior and βprop are hyperparameters to be determined according to validation performances.
Hardware Specification | Yes | We measure the training and sampling speed on a single GeForce RTX 3090.
Software Dependencies | No | We employ GINE (Hu et al., 2019) as the GNN structures... The target values are computed using the RDKit library.
Experiment Setup | Yes | For QM9, we use a short warm-up (3,000 steps), and use a long sigmoid schedule (400,000 steps) (Bowman et al., 2015) to let βprior reach 0.4. ... a small βprop (about 0.3) is beneficial.
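
The Experiment Setup entry describes a warm-up followed by a sigmoid schedule for βprior but does not give the formula. The sketch below shows one plausible reading, in the spirit of the sigmoid KL annealing used by Bowman et al. (2015). It is not taken from the MiCaM code. The function name `beta_prior_schedule`, the `steepness` parameter, the choice to hold the weight at 0 during warm-up, and the rescaling that makes the curve start at exactly 0 and end at exactly 0.4 are all assumptions made for illustration; only the step counts and the 0.4 target come from the quoted text.

```python
import math


def beta_prior_schedule(
    step: int,
    warmup_steps: int = 3_000,     # from the quoted QM9 setup
    anneal_steps: int = 400_000,   # from the quoted QM9 setup
    beta_max: float = 0.4,         # from the quoted QM9 setup
    steepness: float = 10.0,       # hypothetical: controls how sharp the S-curve is
) -> float:
    """Return an illustrative weight for the prior (KL) term at a training step.

    Assumption: the weight stays at 0 during warm-up, then follows a logistic
    curve over `anneal_steps`, and is held at `beta_max` afterwards.
    """
    if step < warmup_steps:
        return 0.0

    # Fraction of the annealing phase completed, clipped to [0, 1].
    progress = min((step - warmup_steps) / anneal_steps, 1.0)

    # Logistic curve centred on the midpoint of the annealing phase.
    raw = 1.0 / (1.0 + math.exp(-steepness * (progress - 0.5)))

    # Rescale so the curve starts at exactly 0 and ends at exactly 1.
    low = 1.0 / (1.0 + math.exp(steepness * 0.5))
    high = 1.0 / (1.0 + math.exp(-steepness * 0.5))
    return beta_max * (raw - low) / (high - low)


if __name__ == "__main__":
    # Print the weight at a few steps across warm-up, annealing, and plateau.
    for s in (0, 3_000, 103_000, 203_000, 303_000, 403_000, 500_000):
        print(f"step {s:>7}: beta_prior = {beta_prior_schedule(s):.4f}")
```

Under these assumptions the weight is 0 through warm-up, reaches half of its target at the midpoint of the annealing phase, and stays at 0.4 once the 400,000 annealing steps are done. A different steepness, or a schedule that does not hold the weight at 0 during warm-up, would be equally consistent with the quoted description.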