Directional Message Passing for Molecular Graphs

Authors: Johannes Gasteiger, Janek Groß, Stephan Günnemann

ICLR 2020

Reproducibility assessment. Each entry below gives the reproducibility variable, its result, and the LLM response (supporting quotes from the paper where applicable).
Research Type: Experimental. "DimeNet outperforms previous GNNs on average by 76% on MD17 and by 31% on QM9." "Our implementation is available online." "We test DimeNet's performance for predicting molecular properties using the common QM9 benchmark (Ramakrishnan et al., 2014)."
Researcher Affiliation: Academia. "Johannes Gasteiger, Janek Groß & Stephan Günnemann, Technical University of Munich, Germany. {j.gasteiger,grossja,guennemann}@in.tum.de"
Pseudocode: No. The paper includes architectural diagrams (Figure 4) and descriptions of the model components, but it does not provide structured pseudocode or algorithm blocks.
Open Source Code: Yes. "Our implementation is available online." (Footnote 1: https://www.daml.in.tum.de/dimenet)
Open Datasets: Yes. "We test DimeNet's performance for predicting molecular properties using the common QM9 benchmark (Ramakrishnan et al., 2014)." "We use MD17 (Chmiela et al., 2017) to test model performance in molecular dynamics simulations."
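A minimal sketch of obtaining both benchmarks, assuming PyTorch Geometric (an assumption; the paper's own implementation ships its own data loading):

from torch_geometric.datasets import QM9, MD17

# QM9: 130,831 small organic molecules with 19 regression targets.
qm9 = QM9(root='data/QM9')

# MD17: molecular dynamics trajectories; 'aspirin' is one of the eight
# molecules (available names vary between PyTorch Geometric versions).
md17 = MD17(root='data/MD17', name='aspirin')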
Dataset Splits: Yes. "We use 110,000 molecules in the training, 10,000 in the validation and 10,831 in the test set." (QM9) "This dataset is commonly used with 50,000 training and 10,000 validation and test samples." (MD17)
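The QM9 numbers add up to 130,831 molecules, i.e., the dataset after invalid structures are removed. A minimal sketch of a random split of this kind in Python, assuming NumPy; the random seed is hypothetical, as the paper does not report one:

import numpy as np

N_QM9 = 130_831  # QM9 molecules after removing invalid structures
rng = np.random.default_rng(42)  # hypothetical seed, not from the paper
perm = rng.permutation(N_QM9)

train_idx = perm[:110_000]       # 110,000 training molecules
val_idx = perm[110_000:120_000]  # 10,000 validation molecules
test_idx = perm[120_000:]        # the remaining 10,831 test molecules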
Hardware Specification: No. The paper does not provide specific details about the hardware used for the experiments, such as GPU models, CPU types, or memory.
Software Dependencies: No. The paper mentions software components such as AMSGrad, ResNet, and the Swish activation, but it does not specify version numbers for any programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup: Yes. "For hyperparameter choices and training setup see Appendix B." "We use 6 stacked interaction blocks and embeddings of size F = 128 throughout the model. For the basis functions we choose N_SHBF = 7 and N_SRBF = N_RBF = 6. For the weight tensor in the interaction block we use N_bilinear = 8. We found the cutoff c = 5 Å and the learning rate 10^-3 to be rather important hyperparameters. We optimized the model using AMSGrad (Reddi et al., 2018) with 32 molecules per mini-batch. We use a linear learning rate warm-up over 3000 steps and an exponential decay with ratio 0.1 every 2,000,000 steps. The model weights for validation and test were obtained using an exponential moving average (EMA) with decay rate 0.999. For MD17 we use the loss function from Eq. 2 with force weight ρ = 100, like previous models (Schütt et al., 2017). We use early stopping on the validation loss. On QM9 we train for at most 3,000,000 and on MD17 for at most 100,000 steps."
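A minimal sketch of this training setup in PyTorch (the released implementation is in TensorFlow; the stand-in model and the L1 form of the energy/force loss are our assumptions):

import torch

# Stand-in model; the actual DimeNet uses 6 interaction blocks and F = 128.
model = torch.nn.Linear(128, 1)

# AMSGrad with initial learning rate 10^-3; mini-batches of 32 molecules.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)

WARMUP_STEPS = 3_000
DECAY_RATIO = 0.1
DECAY_STEPS = 2_000_000

def lr_schedule(step):
    # Linear warm-up over 3000 steps, then exponential decay
    # with ratio 0.1 every 2,000,000 steps.
    return min(1.0, step / WARMUP_STEPS) * DECAY_RATIO ** (step / DECAY_STEPS)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_schedule)

# Exponential moving average (decay 0.999) of the weights,
# used in place of the raw weights for validation and test.
ema = {k: v.detach().clone() for k, v in model.state_dict().items()}

def update_ema(decay=0.999):
    with torch.no_grad():
        for k, v in model.state_dict().items():
            ema[k].mul_(decay).add_(v, alpha=1.0 - decay)

def md17_loss(e_pred, e_true, f_pred, f_true, rho=100.0):
    # Combined energy/force loss with force weight rho = 100 (Eq. 2 of the
    # paper); the exact L1 (MAE) form here is our assumption.
    return (torch.nn.functional.l1_loss(e_pred, e_true)
            + rho * torch.nn.functional.l1_loss(f_pred, f_true))

A training loop would call scheduler.step() and update_ema() after each optimizer step, stop early on the validation loss, and cap training at 3,000,000 steps on QM9 or 100,000 on MD17.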