Geometric Latent Diffusion Models for 3D Molecule Generation

Authors: Minkai Xu, Alexander S. Powers, Ron O. Dror, Stefano Ermon, Jure Leskovec

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that GEOLDM can consistently achieve better performance on multiple molecule generation benchmarks, with up to 7% improvement for the valid percentage of large biomolecules.
Researcher Affiliation | Collaboration | 1 Department of Computer Science, Stanford University; 2 Department of Chemistry, Stanford University.
Pseudocode | Yes | Algorithm 1: Training Algorithm of GEOLDM; Algorithm 2: Sampling Algorithm of GEOLDM.
Open Source Code | Yes | Code is provided at https://github.com/MinkaiXu/GeoLDM.
Open Datasets | Yes | We first adopt the QM9 dataset (Ramakrishnan et al., 2014) for both unconditional and conditional molecule generation. For the molecule generation task, we also test GEOLDM on the GEOM-DRUG (Geometric Ensemble Of Molecules) dataset.
Dataset Splits | Yes | Following (Anderson et al., 2019), we split the train, validation, and test partitions, with 100K, 18K, and 13K samples.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or other computing infrastructure used for the experiments.
Software Dependencies | No | The paper mentions using 'RDKit (Landrum, 2016)' and 'PyTorch (Paszke et al., 2017)' but does not provide version numbers for the software libraries themselves, only the publication years of their respective references.
Experiment Setup | Yes | We set the dimension of latent invariant features k to 1 for QM9 and 2 for DRUG... For the training of the latent denoising network ϵθ: on QM9, we train EGNNs with 9 layers and 256 hidden features with a batch size of 64; on GEOM-DRUG, we train EGNNs with 4 layers and 256 hidden features, with a batch size of 64. For all experiments, we choose the Adam optimizer (Kingma & Ba, 2014) with a constant learning rate of 10^-4 as our default training configuration. Training takes approximately 2000 epochs on QM9 and 20 epochs on DRUG.
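The dataset-split row above states sizes only (100K train / 18K validation / 13K test for QM9); the paper's exact indices and seed are not reproduced here, so the following is a hypothetical sketch of how such a split could be generated. The total sample count and the seed are assumptions.

```python
import numpy as np

def split_qm9(n_samples=131000, n_train=100000, n_val=18000,
              n_test=13000, seed=0):
    """Shuffle sample indices and carve out the split sizes quoted above.

    n_samples and seed are illustrative assumptions; the authors'
    actual index assignment (following Anderson et al., 2019) may differ.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:n_train + n_val + n_test]
    return train, val, test

train_idx, val_idx, test_idx = split_qm9()
```

With a fixed seed, the same partition is recovered on every run, which is the property a reproducibility check would look for.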
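To make the training row concrete, here is a minimal numpy sketch of one latent-diffusion training step in the spirit of Algorithm 1 (Training Algorithm of GEOLDM). The encoder, denoiser, and linear noise schedule are stand-in toys, not the paper's EGNN-based networks; only the batch size of 64 and the latent feature dimension k = 1 come from the quoted setup.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000                                      # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)            # linear noise schedule (assumed)
alphas_bar = np.cumprod(1.0 - betas)          # cumulative signal-retention product

def encode(x):
    # Stand-in for the geometric autoencoder: project features to k = 1.
    return x.mean(axis=-1, keepdims=True)

def denoiser(z_t, t):
    # Stand-in for the EGNN denoising network eps_theta.
    return 0.1 * z_t + t / T

def training_step(x):
    z0 = encode(x)                            # encode molecules into the latent space
    t = rng.integers(0, T)                    # sample a random diffusion timestep
    eps = rng.standard_normal(z0.shape)       # Gaussian noise to be predicted
    z_t = np.sqrt(alphas_bar[t]) * z0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    loss = np.mean((denoiser(z_t, t) - eps) ** 2)   # eps-prediction MSE loss
    return loss

batch = rng.standard_normal((64, 16))         # batch size 64, toy molecule features
loss = training_step(batch)
```

In the real training loop this loss would be minimized with Adam at the constant learning rate of 10^-4 quoted in the setup row.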