Von Mises Mixture Distributions for Molecular Conformation Generation

Authors: Kirk Swanson, Jake Lawrence Williams, Eric M Jonas

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that VonMisesNet can generate conformations for arbitrary molecules in a way that is both physically accurate with respect to the Boltzmann distribution and orders of magnitude faster than existing sampling methods." and "In this section, we evaluate the speed and accuracy of VonMisesNet."
Researcher Affiliation | Academia | "Department of Computer Science, University of Chicago, Chicago, USA. Correspondence to: Kirk Swanson <swansonk1@uchicago.edu>."
Pseudocode | No | The paper illustrates the architecture and describes the process in text and figures, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code is available at https://github.com/thejonaslab/vonmises-icml-2023."
Open Datasets | Yes | "We used PT-HMC to generate conformations for two datasets of molecules: NMRShiftDB and GDB-17 (see details in Appendix B)." and "For NMRShiftDB we took 32,171 molecules from NMRShiftDB (Kuhn, 2019)" and "For GDB-17 we took a random 134,228-molecule subset of the publicly available 50M lead-like molecules made available by the GDB-17 enumeration of chemical space (Ruddigkeit et al., 2012)."
Dataset Splits | No | "We split each of these datasets into training (NMRShiftDB-train and GDB-17-train) and test (NMRShiftDB-test and GDB-17-test) datasets by computing a hash... This produces an 80/20 train/test split." The paper mentions a 'validation loss' in Appendix J, implying a validation set, but does not specify its size or how it was drawn from the training data. (A minimal sketch of hash-based splitting appears after this table.)
Hardware Specification | Yes | "100 conformers were generated for each molecule on a 64-core machine that has a single NVIDIA GeForce RTX 2080 Ti GPU." and "Training on NMRShiftDB-train took 7.7 hours and training on GDB-17-train took 16.7 hours on a single NVIDIA GeForce RTX 2080 Ti GPU."
Software Dependencies | No | The paper mentions software like RDKit and MMFF but does not provide specific version numbers for these or other key dependencies.
Experiment Setup | Yes | "We use 20 graph convolution layers, a hidden size of 256, a batch size of 32, and the Adam optimizer with a learning rate of 0.0001. We use gradient clipping for all of the model parameters with a cutoff value of 1.0." (A sketch of this training configuration appears below.)
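
The Dataset Splits quote elides what exactly is hashed ("computing a hash..."). The following is a minimal Python sketch of a deterministic hash-based 80/20 split; hashing each molecule's canonical SMILES string is an assumption made here for illustration, not the authors' confirmed procedure.

    import hashlib

    def split_bucket(smiles: str, train_fraction: float = 0.8) -> str:
        """Deterministically assign a molecule to 'train' or 'test'."""
        # MD5 is stable across runs and machines, unlike Python's built-in
        # hash(), which is salted per process.
        digest = hashlib.md5(smiles.encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100  # map the digest to [0, 100)
        return "train" if bucket < int(train_fraction * 100) else "test"

    # Toy usage; real inputs would be the NMRShiftDB or GDB-17 molecules.
    for smi in ["CCO", "c1ccccc1", "CC(=O)O"]:
        print(smi, split_bucket(smi))

Because the assignment depends only on the molecule itself, the same molecule always lands in the same split, which is the usual motivation for hash-based splitting over random shuffling.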
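
The Experiment Setup row pins down the optimizer settings but not the full model. The PyTorch sketch below wires up only the quoted hyperparameters (Adam at lr=0.0001, batch size 32, gradient clipping with a cutoff of 1.0); the two-layer placeholder network and MSE loss stand in for the paper's 20-layer graph convolution model with hidden size 256 and its actual objective, neither of which is reproduced here. Norm-based clipping is also an assumption, since the paper says only "cutoff value of 1.0".

    import torch
    import torch.nn as nn

    # Placeholder network; the paper's actual model is a graph neural network
    # with 20 graph convolution layers and hidden size 256.
    model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)  # lr from the paper
    loss_fn = nn.MSELoss()  # placeholder loss, not the paper's objective

    # Random tensors stand in for real training data, batched at 32 as quoted.
    dataset = torch.utils.data.TensorDataset(torch.randn(128, 256), torch.randn(128, 1))
    loader = torch.utils.data.DataLoader(dataset, batch_size=32)

    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        # Clip gradients of all model parameters at a cutoff of 1.0.
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()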