Learning Neural Generative Dynamics for Molecular Conformation Generation
Authors: Minkai Xu, Shitong Luo, Yoshua Bengio, Jian Peng, Jian Tang
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct comprehensive experiments on several recently proposed benchmarks, including GEOM-QM9, GEOM-Drugs (Axelrod & Gomez-Bombarelli, 2020) and ISO17 (Simm & Hernández-Lobato, 2020). Numerical evaluations show that our proposed framework consistently outperforms the previous state-of-the-art (GraphDG) on both conformation generation and distance modeling tasks, with a clear margin. |
| Researcher Affiliation | Collaboration | 1 Mila - Québec AI Institute, Canada; 2 Université de Montréal, Canada; 3 Peking University, China; 4 Canadian Institute for Advanced Research (CIFAR), Canada; 5 University of Illinois at Urbana-Champaign, USA; 6 HEC Montréal, Canada |
| Pseudocode | Yes | Algorithm 1: Sampling Procedure of the Proposed Method. Input: molecular graph G; CGCF model with parameter θ; ETM with parameter φ; the number of optimization steps M for p(R\|d, G) and its step size r; the number of MCMC steps N for Eθ,φ and its step size ϵ. Output: molecular conformation R. |
| Open Source Code | Yes | Code is available at https://github.com/DeepGraphLearning/CGCF-ConfGen. |
| Open Datasets | Yes | We use the recently proposed GEOM-QM9 and GEOM-Drugs (Axelrod & Gomez-Bombarelli, 2020) datasets for conformation generation task and ISO17 dataset (Simm & Hernández-Lobato, 2020) for distances modeling task. |
| Dataset Splits | Yes | GEOM-QM9 is an extension to the QM9 (Ramakrishnan et al., 2014) dataset: it contains multiple conformations for most molecules while the original QM9 only contains one. ... We randomly draw 50000 conformation-molecule pairs from GEOM-QM9 to be the training set, and take another 17813 conformations covering 150 molecular graphs as the test set. ... We randomly take 50000 conformation-molecule pairs from GEOM-Drugs as the training set, and another 9161 conformations (covering 100 molecular graphs) as the test split. ISO17 dataset is also built upon QM9 datasets, which consists of 197 molecules, each with 5000 conformations. Following Simm & Hernández-Lobato (2020), we also split ISO17 into the training set with 167 molecules and the test set with another 30 molecules. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU models, CPU types) used for running its experiments. It only mentions implementation in PyTorch. |
| Software Dependencies | Yes | Our model is implemented in PyTorch (Paszke et al., 2017). |
| Experiment Setup | Yes | The MPNN in CGCF is implemented with 3 layers, and the embedding dimension is set as 128. And the SchNet in ETM is implemented with 6 layers with the embedding dimension set as 128. We train our CGCF with a batch size of 128 and a learning rate of 0.001 until convergence. After obtaining the CGCF, we train the ETM with a batch size of 384 and a learning rate of 0.001 until convergence. For all experimental settings, we use Adam (Kingma & Ba, 2014) to optimize our model. |
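
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object when attempting a reproduction. The sketch below is only illustrative: the class name `StageConfig`, its field names, and the `summarize` helper are assumptions made for this example and are not taken from the authors' repository. Only the numeric values, the network types, the optimizer, and the CGCF-then-ETM training order come from the row above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class StageConfig:
    """Hyperparameters for one training stage (hypothetical field names)."""
    network: str
    num_layers: int
    embed_dim: int
    batch_size: int
    learning_rate: float
    optimizer: str = "Adam"  # Adam is reported for all experimental settings


# Values below are the ones quoted in the Experiment Setup row.
CGCF = StageConfig(network="MPNN", num_layers=3, embed_dim=128,
                   batch_size=128, learning_rate=1e-3)
ETM = StageConfig(network="SchNet", num_layers=6, embed_dim=128,
                  batch_size=384, learning_rate=1e-3)

# The ETM is trained after the CGCF has been obtained, so order matters.
TRAINING_ORDER = (("CGCF", CGCF), ("ETM", ETM))


def summarize(order=TRAINING_ORDER):
    """Return one human-readable line per stage, in training order."""
    lines = []
    for name, cfg in order:
        fields = ", ".join(f"{k}={v}" for k, v in asdict(cfg).items())
        lines.append(f"{name}: {fields}")
    return lines


if __name__ == "__main__":
    print("\n".join(summarize()))
```

Both stages are reported as trained "until convergence", so no epoch or step count appears in the sketch; any stopping criterion would have to be chosen by whoever reruns the experiments.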