Neural Message Passing for Quantum Chemistry

Authors: Justin Gilmer, Samuel S. Schoenholz, Patrick F. Riley, Oriol Vinyals, George E. Dahl

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using MPNNs we demonstrate state of the art results on an important molecular property prediction benchmark; these results are strong enough that we believe future work should focus on datasets with larger molecules or more accurate ground truth labels. In Table 2 we compare the performance of our best MPNN variant (denoted with enn-s2s) and the corresponding ensemble (denoted with enn-s2s-ens5) with the previous state of the art on this dataset as reported in Faber et al. (2017).
Researcher Affiliation | Industry | ¹Google Brain, ²Google, ³Google DeepMind. Correspondence to: Justin Gilmer <gilmer@google.com>, George E. Dahl <gdahl@google.com>.
Pseudocode | No | The paper describes the model through mathematical definitions and step-by-step descriptions but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. (A minimal illustrative sketch of the message passing and readout phases follows the table.)
Open Source Code | No | The paper does not provide a statement about releasing open-source code or a link to a code repository for the described methodology.
Open Datasets | Yes | To investigate the success of MPNNs on predicting chemical properties, we use the publicly available QM9 dataset (Ramakrishnan et al., 2014).
Dataset Splits | Yes | The QM-9 dataset has 130462 molecules in it. We randomly chose 10000 samples for validation, 10000 samples for testing, and used the rest for training. (A sketch of such a split appears after the table.)
Hardware Specification | No | The paper mentions 'Xeon E5-2660 (2.2 GHz)' in the context of DFT calculations (a baseline), but does not specify the hardware used for training or running the MPNN experiments described in the paper.
Software Dependencies | No | The paper mentions optimizers (ADAM) and units (GRU) but does not list specific software dependencies with version numbers (e.g., PyTorch, TensorFlow, Python version) used for the experiments.
Experiment Setup | Yes | Each model and target combination was trained using a uniform random hyperparameter search with 50 trials. T was constrained to be in the range 3 ≤ T ≤ 8 (in practice, any T ≥ 3 works). The number of set2set computations M was chosen from the range 1 ≤ M ≤ 12. All models were trained using SGD with the ADAM optimizer (Kingma & Ba, 2014), with batch size 20 for 3 million steps (≈540 epochs). The initial learning rate was chosen uniformly between 1e−5 and 5e−4. We used a linear learning rate decay that began between 10% and 90% of the way through training, and the initial learning rate l decayed to a final learning rate l·F, using a decay factor F in the range [0.01, 1]. (A sketch of sampling this search space and the decay schedule follows the table.)
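
As noted in the Pseudocode row, the model is specified only through mathematical definitions: a message phase m_v^{t+1} = Σ_{w∈N(v)} M_t(h_v^t, h_w^t, e_vw), an update phase h_v^{t+1} = U_t(h_v^t, m_v^{t+1}), and a readout R over the final node states. The sketch below is one way to write that generic forward pass, not the authors' implementation; the NumPy setting, the dense adjacency representation, and the message_fn/update_fn/readout_fn names are assumptions for illustration. In the paper's best enn-s2s variant the message function is an edge network, the update function is a GRU, and the readout is set2set; the toy functions here are simple stand-ins.

```python
import numpy as np

def mpnn_forward(node_states, edge_features, adjacency, T,
                 message_fn, update_fn, readout_fn):
    """Generic MPNN forward pass: T rounds of message passing, then a readout.

    node_states:   (n_nodes, d) initial hidden states h_v^0
    edge_features: (n_nodes, n_nodes, d_e) edge feature vectors e_vw
    adjacency:     (n_nodes, n_nodes) 0/1 connectivity
    """
    h = node_states
    n = h.shape[0]
    for _ in range(T):
        # Message phase: m_v^{t+1} = sum over neighbours w of M_t(h_v^t, h_w^t, e_vw)
        messages = np.zeros_like(h)
        for v in range(n):
            for w in range(n):
                if adjacency[v, w]:
                    messages[v] += message_fn(h[v], h[w], edge_features[v, w])
        # Update phase: h_v^{t+1} = U_t(h_v^t, m_v^{t+1})
        h = np.stack([update_fn(h[v], messages[v]) for v in range(n)])
    # Readout phase: y_hat = R({h_v^T})
    return readout_fn(h)

# Toy instantiation (assumed for illustration): linear message, additive tanh
# update, and a sum readout standing in for set2set.
d, d_e, n = 4, 2, 3
rng = np.random.default_rng(0)
W_msg = rng.normal(size=(d + d_e, d))
message_fn = lambda h_v, h_w, e_vw: np.concatenate([h_w, e_vw]) @ W_msg
update_fn = lambda h_v, m_v: np.tanh(h_v + m_v)
readout_fn = lambda h: h.sum(axis=0)

h0 = rng.normal(size=(n, d))
e = rng.normal(size=(n, n, d_e))
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
print(mpnn_forward(h0, e, adj, T=3, message_fn=message_fn,
                   update_fn=update_fn, readout_fn=readout_fn))
```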
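
The Dataset Splits row quotes a random 10000/10000/remainder partition of the 130462 QM9 molecules. A minimal sketch of drawing such a split is below; the random seed and the index-array representation are assumptions, since the paper does not publish its exact partition.

```python
import numpy as np

N_MOLECULES = 130462          # QM9 size reported in the paper
N_VALID, N_TEST = 10000, 10000

rng = np.random.default_rng(seed=0)    # seed is an assumption; the paper's split is unpublished
perm = rng.permutation(N_MOLECULES)

valid_idx = perm[:N_VALID]
test_idx = perm[N_VALID:N_VALID + N_TEST]
train_idx = perm[N_VALID + N_TEST:]    # remaining 110462 molecules for training

print(len(train_idx), len(valid_idx), len(test_idx))
```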
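
The Experiment Setup row quotes a 50-trial uniform random search over T, M, the initial learning rate, and a linear learning-rate decay. The sketch below samples that search space and evaluates the decayed learning rate at a given step; the function names, the seed, and the exact handling of the decay start point are assumptions based only on the quoted ranges.

```python
import random

TOTAL_STEPS = 3_000_000   # batch size 20 for 3 million steps (~540 epochs)

def sample_trial(rng):
    """Draw one configuration from the search space quoted in the paper."""
    return {
        "T": rng.randint(3, 8),                  # message passing steps, 3 <= T <= 8
        "M": rng.randint(1, 12),                 # set2set computations, 1 <= M <= 12
        "init_lr": rng.uniform(1e-5, 5e-4),      # initial learning rate, uniform
        "decay_start": rng.uniform(0.1, 0.9),    # fraction of training when decay begins
        "decay_factor": rng.uniform(0.01, 1.0),  # final lr = init_lr * decay_factor
    }

def learning_rate(step, cfg):
    """Linear decay from init_lr to init_lr * decay_factor, starting at decay_start."""
    start = int(cfg["decay_start"] * TOTAL_STEPS)
    final_lr = cfg["init_lr"] * cfg["decay_factor"]
    if step <= start:
        return cfg["init_lr"]
    frac = (step - start) / (TOTAL_STEPS - start)
    return cfg["init_lr"] + frac * (final_lr - cfg["init_lr"])

rng = random.Random(0)
trials = [sample_trial(rng) for _ in range(50)]  # 50-trial uniform random search
print(trials[0], learning_rate(2_000_000, trials[0]))
```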