Equivariant Blurring Diffusion for Hierarchical Molecular Conformer Generation
Authors: Jiwoong Park, Yang Shen
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our hierarchical molecular conformer generation framework via Equivariant Blurring Diffusion (EBD) on the molecular conformer generation task. We conducted experiments to answer the following questions: i) Ablation studies (Sec. 5.2): What are the effects of the granularity of the fragment vocabulary, the loss reparameterization, and the data corruption processes of diffusion models? ii) Geometric evaluation (Sec. 5.3): Can EBD generate more diverse and accurate molecular conformers in Euclidean space than previous deep generative approaches? iii) Property prediction (Sec. 5.4): Can EBD generate low-energy, stable conformers? |
| Researcher Affiliation | Academia | Jiwoong Park, Yang Shen; Department of Electrical and Computer Engineering, Texas A&M University; ptywoong@gmail.com, yshen@tamu.edu |
| Pseudocode | Yes | In this subsection, we provide the PyTorch-style [37] pseudo-codes. The RDKit conformer generator used to obtain the approximate fragment structure, the linear interpolation blurring schedule, the training process, and the sampling process are given in Pseudo-codes 1, 2, 3, and 4, respectively. (Hedged sketches of the first two components are given after this table.) |
| Open Source Code | Yes | Code is released at https://github.com/Shen-Lab/EBD. |
| Open Datasets | Yes | We use GEOM-QM9 (QM9) [39] and GEOM-Drugs (Drugs) [1], which consist of small molecules and drug-like molecules, respectively. We obtained the raw data, the pre-processed data, and the data split at https://github.com/DeepGraphLearning/ConfGF. |
| Dataset Splits | Yes | Each dataset comprises 40,000 molecules for the training set and 5,000 molecules for the validation set, with each molecule containing 5 conformers, following the data split of [46]. |
| Hardware Specification | Yes | We used a single NVIDIA A100 GPU for all training and generation tasks. |
| Software Dependencies | No | The paper mentions using PyTorch and RDKit but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | For EBD, we use T = 50, a noise scale of 0.01 for the forward process (σ in Eq. 6), and 0.0125 for the reverse process (δ in Eq. 8) in all experiments. For training, we used a learning rate of 10^-4 with the AdamW optimizer [33]. Table 7 (Hyperparameters of EBD): Drugs: T = 50, # layers l = 6, hidden dim d = 128, 3 hops, 10 Å cutoff, batch size 32, 650k training iterations; QM9: T = 50, l = 6, d = 128, 3 hops, 10 Å cutoff, batch size 64, 650k training iterations. (A minimal sketch of this optimization setup also follows the table.) |
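
As a companion to the Pseudocode row, here is a minimal, hypothetical sketch of the RDKit step that obtains an approximate 3D structure (Pseudo-code 1 in the paper). The function name, SMILES input, seed, and force-field cleanup are illustrative assumptions, not the released code:

```python
from rdkit import Chem
from rdkit.Chem import AllChem

def approximate_structure(smiles: str, seed: int = 0):
    """Embed a molecule (or fragment) in 3D with RDKit's ETKDG.

    A stand-in for the RDKit conformer generator the paper uses to
    obtain approximate fragment structures; EBD's exact inputs and
    post-processing may differ.
    """
    mol = Chem.AddHs(Chem.MolFromSmiles(smiles))
    AllChem.EmbedMolecule(mol, randomSeed=seed)  # ETKDG 3D embedding
    AllChem.MMFFOptimizeMolecule(mol)            # optional force-field cleanup
    return mol

conf = approximate_structure("CCO").GetConformer()  # ethanol as a toy input
print(conf.GetPositions().shape)                    # (num_atoms, 3)
```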
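
Likewise, a minimal sketch of a linear interpolation blurring schedule (Pseudo-code 2), assuming the forward process linearly interpolates atom coordinates toward coarse fragment centroids and injects the σ = 0.01 noise additively; the shapes and exact parameterization are assumptions:

```python
import torch

def blur(x0: torch.Tensor, x_coarse: torch.Tensor, t: torch.Tensor,
         T: int = 50, sigma: float = 0.01) -> torch.Tensor:
    """Blur atom coordinates x0 toward a coarse target x_coarse
    (e.g., fragment centroids broadcast to atoms) at step t.

    A sketch, not the released forward process.
    x0, x_coarse: (num_atoms, 3); t: integer tensor in [0, T].
    """
    alpha = (t.float() / T).view(-1, 1)         # 0 = clean, 1 = fully blurred
    x_t = (1 - alpha) * x0 + alpha * x_coarse   # linear interpolation schedule
    return x_t + sigma * torch.randn_like(x_t)  # forward-process noise (sigma)

x0 = torch.randn(9, 3)                              # toy atom coordinates
x_coarse = x0.mean(0, keepdim=True).expand_as(x0)   # one centroid for all atoms
x_half = blur(x0, x_coarse, torch.tensor([25]))     # halfway-blurred sample
```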
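
Finally, a minimal sketch of the reported optimization setup (AdamW, learning rate 10^-4, T = 50, the σ/δ noise scales, and Table 7's Drugs batch size and iteration count). The network and loss below are placeholders, not the EBD architecture or objective:

```python
import torch
import torch.nn as nn

# Placeholder network; the actual EBD model is an equivariant GNN with
# l = 6 layers, hidden dim d = 128, 3 hops, and a 10 Å cutoff (Table 7).
model = nn.Sequential(nn.Linear(3, 128), nn.SiLU(), nn.Linear(128, 3))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # reported lr

# delta (reverse-process noise, Eq. 8) is used at sampling time, not here.
T, sigma, delta = 50, 0.01, 0.0125

for it in range(650_000):            # 650k training iterations (Table 7)
    x0 = torch.randn(32, 3)          # toy batch (Drugs uses batch size 32)
    x_t = x0 + sigma * torch.randn_like(x0)
    # Placeholder denoising-style objective standing in for EBD's loss.
    loss = (model(x_t) - x0).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```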