Directional Message Passing for Molecular Graphs
Authors: Johannes Gasteiger, Janek Groß, Stephan Günnemann
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | DimeNet outperforms previous GNNs on average by 76 % on MD17 and by 31 % on QM9. Our implementation is available online. We test DimeNet's performance for predicting molecular properties using the common QM9 benchmark (Ramakrishnan et al., 2014). |
| Researcher Affiliation | Academia | Johannes Gasteiger, Janek Groß & Stephan Günnemann Technical University of Munich, Germany {j.gasteiger,grossja,guennemann}@in.tum.de |
| Pseudocode | No | The paper includes architectural diagrams (Figure 4) and descriptions of the model components but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our implementation is available online. [Footnote 1: https://www.daml.in.tum.de/dimenet] |
| Open Datasets | Yes | We test DimeNet's performance for predicting molecular properties using the common QM9 benchmark (Ramakrishnan et al., 2014). We use MD17 (Chmiela et al., 2017) to test model performance in molecular dynamics simulations. |
| Dataset Splits | Yes | For QM9: We use 110 000 molecules in the training, 10 000 in the validation and 10 831 in the test set. For MD17: This dataset is commonly used with 50 000 training and 10 000 validation and test samples. (A sketch of such a split follows the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions software components like AMSGrad, ResNet, and Swish activation, but it does not specify version numbers for any programming languages, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | For hyperparameter choices and training setup see Appendix B. We use 6 stacked interaction blocks and embeddings of size F = 128 throughout the model. For the basis functions we choose N_SHBF = 7 and N_SRBF = N_RBF = 6. For the weight tensor in the interaction block we use N_bilinear = 8. We found the cutoff c = 5 Å and the learning rate 1 × 10⁻³ to be rather important hyperparameters. We optimized the model using AMSGrad (Reddi et al., 2018) with 32 molecules per mini-batch. We use a linear learning rate warm-up over 3000 steps and an exponential decay with ratio 0.1 every 2 000 000 steps. The model weights for validation and test were obtained using an exponential moving average (EMA) with decay rate 0.999. For MD17 we use the loss function from Eq. 2 with force weight ρ = 100, like previous models (Schütt et al., 2017). We use early stopping on the validation loss. On QM9 we train for at most 3 000 000 and on MD17 for at most 100 000 steps. (Sketches of the learning-rate schedule and the force-weighted loss follow the table.) |
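The QM9 split sizes quoted above (110 000 / 10 000 / 10 831) add up to the full 130 831-molecule dataset. Here is a minimal sketch of producing such a split; the random seed and the use of a uniform shuffle are assumptions, since the paper does not state how the indices were drawn:

```python
import numpy as np

# Reproduce the QM9 split sizes quoted in the paper:
# 110,000 train / 10,000 validation / 10,831 test (130,831 molecules total).
N_TOTAL = 130_831
N_TRAIN, N_VALID = 110_000, 10_000

rng = np.random.default_rng(42)  # seed is an assumption; the paper does not specify one
perm = rng.permutation(N_TOTAL)

train_idx = perm[:N_TRAIN]
valid_idx = perm[N_TRAIN:N_TRAIN + N_VALID]
test_idx = perm[N_TRAIN + N_VALID:]  # the remaining 10,831 molecules

assert len(test_idx) == 10_831
```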
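The learning-rate schedule described in the setup row (linear warm-up over 3000 steps, exponential decay with ratio 0.1 every 2 000 000 steps, base rate 1 × 10⁻³) can be written as a single function. Whether the original code applies the decay continuously or in staircase steps is an assumption here; this sketch uses continuous decay:

```python
def learning_rate(step: int,
                  base_lr: float = 1e-3,
                  warmup_steps: int = 3_000,
                  decay_rate: float = 0.1,
                  decay_steps: int = 2_000_000) -> float:
    """Linear warm-up followed by exponential decay, per Appendix B."""
    warmup = min(step / warmup_steps, 1.0)       # ramps 0 -> 1 over the warm-up
    decay = decay_rate ** (step / decay_steps)   # multiplies the rate by 0.1 every 2M steps
    return base_lr * warmup * decay
```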
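The MD17 loss with force weight ρ = 100 combines an energy error with a force error, where the forces are obtained as the negative gradient of the predicted energy with respect to atom positions. A hedged PyTorch-style sketch (the paper's released implementation is in TensorFlow, `model` and its signature are hypothetical, and the exact norm and averaging of Eq. 2 may differ):

```python
import torch

def energy_force_loss(model, atom_types, positions,
                      energy_target, force_target, rho: float = 100.0):
    """Joint energy/force loss with force weight rho, in the spirit of Eq. 2."""
    positions.requires_grad_(True)
    energy_pred = model(atom_types, positions)
    # Forces are the negative gradient of the predicted energy w.r.t. positions.
    forces_pred = -torch.autograd.grad(
        energy_pred.sum(), positions, create_graph=True
    )[0]
    energy_loss = (energy_pred - energy_target).abs().mean()
    force_loss = (forces_pred - force_target).abs().mean()
    return energy_loss + rho * force_loss
```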