De Novo Molecular Generation via Connection-aware Motif Mining
Authors: Zijie Geng, Shufang Xie, Yingce Xia, Lijun Wu, Tao Qin, Jie Wang, Yongdong Zhang, Feng Wu, Tie-Yan Liu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our method on distribution-learning benchmarks (i.e., generating novel molecules to resemble the distribution of a given training set) and goal-directed benchmarks (i.e., generating molecules with target properties), and achieve significant improvements over previous fragment-based baselines. |
| Researcher Affiliation | Collaboration | Zijie Geng (1), Shufang Xie (2), Yingce Xia (3), Lijun Wu (3), Tao Qin (3), Jie Wang (1,4), Yongdong Zhang (1), Feng Wu (1), Tie-Yan Liu (3). (1) University of Science and Technology of China: ustcgzj@mail.ustc.edu.cn, {jiewangx, zhyd73, fengwu}@ustc.edu.cn; (2) Gaoling School of Artificial Intelligence, Renmin University of China: shufangxie@ruc.edu.cn; (3) Microsoft Research AI4Science: {yingce.xia, lijunwu, taoqin, tyliu}@microsoft.com; (4) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center |
| Pseudocode | Yes | Algorithm 1: Connection-aware Motif Mining; Algorithm 2: Generating a molecule |
| Open Source Code | Yes | The code of MiCaM is available at https://github.com/MIRALab-USTC/AI4Sci-MiCaM. |
| Open Datasets | Yes | We evaluate our method on three datasets: QM9 (Ruddigkeit et al., 2012), ZINC (Irwin et al., 2012), and GuacaMol (a post-processed ChEMBL (Mendez et al., 2019) dataset proposed by Brown et al. (2019)). |
| Dataset Splits | No | β_prior and β_prop are hyperparameters to be determined according to validation performances. |
| Hardware Specification | Yes | We measure the training and sampling speed on a single GeForce RTX 3090. |
| Software Dependencies | No | We employ GINE (Hu et al., 2019) as the GNN structures... The target values are computed using the RDKit library. |
| Experiment Setup | Yes | For QM9, we use a short warm-up (3,000 steps), and use a long sigmoid schedule (400,000 steps) (Bowman et al., 2015) to let β_prior reach 0.4. ... a small β_prop (about 0.3) is beneficial. |
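
The Experiment Setup row describes a warm-up followed by a sigmoid schedule for the β weight on the prior term. The following is a minimal sketch of what such a schedule might look like, not the authors' implementation. The step counts (3,000 and 400,000) and the target value 0.4 come from the row above; the function name `beta_schedule`, the `steepness` parameter, the choice to hold the weight at zero during warm-up, and the rescaling of the logistic curve are all assumptions made for illustration.

```python
import math


def beta_schedule(step, warmup_steps=3_000, anneal_steps=400_000,
                  beta_max=0.4, steepness=10.0):
    """Hypothetical schedule: 0 during warm-up, then a sigmoid ramp to beta_max.

    `steepness` is an assumed parameter; the source does not state one.
    """
    # Assumption: the weight is held at zero for the whole warm-up phase.
    if step < warmup_steps:
        return 0.0
    # Fraction of the annealing phase completed, clipped to [0, 1].
    t = min((step - warmup_steps) / anneal_steps, 1.0)
    # Logistic curve centred on the midpoint of the annealing phase.
    raw = 1.0 / (1.0 + math.exp(-steepness * (t - 0.5)))
    # Rescale so the ramp starts at exactly 0 and ends at exactly beta_max.
    lo = 1.0 / (1.0 + math.exp(steepness * 0.5))
    hi = 1.0 / (1.0 + math.exp(-steepness * 0.5))
    return beta_max * (raw - lo) / (hi - lo)


if __name__ == "__main__":
    # Print the weight at a few training steps.
    for s in (0, 3_000, 103_000, 203_000, 303_000, 403_000):
        print(s, round(beta_schedule(s), 4))
```

In a training loop, the returned value would multiply the prior loss term at each step. A larger `steepness` keeps the weight near zero for longer and then raises it more abruptly around the midpoint.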