Geometric Latent Diffusion Models for 3D Molecule Generation
Authors: Minkai Xu, Alexander S. Powers, Ron O. Dror, Stefano Ermon, Jure Leskovec
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that GEOLDM can consistently achieve better performance on multiple molecule generation benchmarks, with up to 7% improvement for the valid percentage of large biomolecules. |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science, Stanford University; (2) Department of Chemistry, Stanford University. |
| Pseudocode | Yes | Algorithm 1 Training Algorithm of GEOLDM; Algorithm 2 Sampling Algorithm of GEOLDM (see the hedged sketch after the table). |
| Open Source Code | Yes | Code is provided at https://github.com/MinkaiXu/GeoLDM. |
| Open Datasets | Yes | We first adopt QM9 dataset (Ramakrishnan et al., 2014) for both unconditional and conditional molecule generation. For the molecule generation task, we also test GEOLDM on the GEOM-DRUG (Geometric Ensemble Of Molecules) dataset. |
| Dataset Splits | Yes | Following (Anderson et al., 2019), we split the train, validation, and test partitions, with 100K, 18K, and 13K samples. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or specific computing infrastructure used for the experiments. |
| Software Dependencies | No | The paper mentions using 'RDKit (Landrum, 2016)' and 'PyTorch (Paszke et al., 2017)' but does not provide specific version numbers for the software libraries themselves, only the publication years of their respective references. |
| Experiment Setup | Yes | We set the dimension of latent invariant features k to 1 for QM9 and 2 for DRUG... For the training of the latent denoising network ϵθ: on QM9, we train EGNNs with 9 layers and 256 hidden features with a batch size of 64; and on GEOM-DRUG, we train EGNNs with 4 layers and 256 hidden features, with a batch size of 64. For all the experiments, we choose the Adam optimizer (Kingma & Ba, 2014) with a constant learning rate of 10⁻⁴ as our default training configuration. The training on QM9 takes approximately 2000 epochs, and on DRUG takes 20 epochs. (These values are collected in the configuration sketch after the table.) |
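
To make the Pseudocode row concrete, below is a minimal PyTorch sketch of what Algorithms 1 and 2 describe: diffusion training and ancestral sampling in GeoLDM's equivariant latent space. The module names (`encoder`, `denoiser`, `decoder`), their call signatures, the linear noise schedule, and the timestep count `T` are all assumptions made for illustration; only the overall structure (encode, add noise, denoise with an EGNN, then reverse the chain and decode) follows the paper.

```python
# Hedged sketch of GeoLDM's Algorithm 1 (training) and Algorithm 2 (sampling).
# `encoder`, `denoiser`, and `decoder` are hypothetical stand-ins for the paper's
# EGNN-based autoencoder and latent denoising network; signatures are assumed.
import torch
import torch.nn.functional as F

T = 1000                                     # diffusion timesteps (assumed)
betas = torch.linspace(1e-4, 0.02, T)        # linear noise schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def training_step(encoder, denoiser, x, h):
    """One step of Algorithm 1 for coordinates x [N, 3] and atom features h [N, d]."""
    z_x, z_h = encoder(x, h)                 # equivariant latent: 3D part plus k invariant features
    z = torch.cat([z_x, z_h], dim=-1)        # [N, 3 + k]

    t = torch.randint(0, T, (1,))            # random timestep
    eps = torch.randn_like(z)
    eps[:, :3] -= eps[:, :3].mean(dim=0)     # zero-center-of-mass noise on the coordinate part

    a = alphas_bar[t]
    z_t = a.sqrt() * z + (1.0 - a).sqrt() * eps   # forward diffusion q(z_t | z)

    eps_pred = denoiser(z_t, t)              # the latent EGNN predicts the injected noise
    return F.mse_loss(eps_pred, eps)         # simple epsilon-matching objective

@torch.no_grad()
def sample(denoiser, decoder, n_atoms, k=1):
    """Algorithm 2: start from Gaussian latents, run the reverse chain, decode a molecule."""
    z = torch.randn(n_atoms, 3 + k)
    for t in reversed(range(T)):
        eps_pred = denoiser(z, torch.tensor([t]))
        alpha_t = 1.0 - betas[t]
        # DDPM-style ancestral update (posterior mean, plus noise except at t = 0).
        z = (z - betas[t] / (1.0 - alphas_bar[t]).sqrt() * eps_pred) / alpha_t.sqrt()
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    return decoder(z[:, :3], z[:, 3:])       # map latents back to coordinates and atom types
```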
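
Similarly, the hyperparameters quoted in the Experiment Setup row fit in one small configuration object. A minimal sketch, assuming a plain dictionary layout; the key names are illustrative, while the values are the ones reported in the paper:

```python
# Reported GeoLDM training hyperparameters; the dictionary layout and key names
# are illustrative, only the values come from the paper.
GEOLDM_CONFIGS = {
    "qm9": {
        "latent_invariant_dim": 1,   # k
        "denoiser_layers": 9,        # EGNN layers in the latent denoising network
        "hidden_features": 256,
        "batch_size": 64,
        "optimizer": "adam",
        "learning_rate": 1e-4,       # constant
        "epochs": 2000,              # approximate
    },
    "geom_drug": {
        "latent_invariant_dim": 2,
        "denoiser_layers": 4,
        "hidden_features": 256,
        "batch_size": 64,
        "optimizer": "adam",
        "learning_rate": 1e-4,
        "epochs": 20,
    },
}
```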