Diffusion-based Molecule Generation with Informative Prior Bridges
Authors: Lemeng Wu, Chengyue Gong, Xingchao Liu, Mao Ye, Qiang Liu
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With comprehensive experiments, we show that our method provides a powerful approach to the 3D generation task, yielding molecule structures with better quality and stability scores and more uniformly distributed point clouds of high quality. |
| Researcher Affiliation | Academia | Lemeng Wu University of Texas at Austin lmwu@cs.utexas.edu Chengyue Gong University of Texas at Austin cygong@cs.utexas.edu Xingchao Liu University of Texas at Austin xcliu@cs.utexas.edu Mao Ye University of Texas at Austin my21@cs.utexas.edu Qiang Liu University of Texas at Austin lqiang@cs.utexas.edu |
| Pseudocode | Yes | Algorithm 1 Learning diffusion generative models. |
| Open Source Code | No | The paper does not provide a link or repository for the source code of the described methodology. |
| Open Datasets | Yes | Dataset Settings QM9 [36] molecular properties and atom coordinates for 130k small molecules with up to 9 heavy atoms with 5 different types of atoms. ... GEOM-DRUG [4] is a dataset that contains drug-like molecules. ... We use the Shape Net [6] dataset for point cloud generation. |
| Dataset Splits | Yes | We follow the common practice in [19] to split the train, validation, and test partitions, with 100K, 18K, and 13K samples. |
| Hardware Specification | Yes | It takes approximately 10 days to train the model on these two datasets on one Tesla V100-SXM2-32GB GPU. |
| Software Dependencies | No | The paper does not list its software dependencies with version numbers (e.g., library or solver names and versions). |
| Experiment Setup | Yes | On QM9, we train the EGNNs with 256 hidden features and 9 layers for 1100 epochs, with a batch size of 64 and a constant learning rate of 10^-4, which is the default training configuration. We use the polynomial noise schedule from [19], which linearly decays from 10^-2/T to 0. We linearly decay from 10^-3/T to 0 w.r.t. the time step. We set k = 5 (7) by default. On GEOM-DRUG, we train the EGNNs with 256 hidden features and 8 layers, with a batch size of 64 and a constant learning rate of 10^-4, for 10 epochs. |
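The experiment-setup quote describes schedules that decay linearly from an initial value of c/T down to 0 over the T diffusion steps (c = 10^-2 for the noise schedule, 10^-3 for the second schedule). A minimal sketch of that decay is below; the function name `linear_decay_schedule` and the reading of the endpoints as per-step values are assumptions for illustration, not details from the paper, and this does not reproduce the full polynomial schedule of [19].

```python
import numpy as np

def linear_decay_schedule(T: int, c: float) -> np.ndarray:
    """Per-step values decaying linearly from c/T at step 0 to 0 at step T-1.

    Illustrates the "linearly decay from c/T to 0 w.r.t. time step" description
    in the experiment setup; the endpoints c = 1e-2 and c = 1e-3 come from the
    quoted text, everything else is a sketch.
    """
    return np.linspace(c / T, 0.0, T)

# Example with T = 1000 diffusion steps and the quoted endpoint 10^-2/T.
sched = linear_decay_schedule(1000, 1e-2)
```

The schedule starts at exactly c/T, decreases by a constant amount each step, and reaches 0 at the final step.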