Simple GNN Regularisation for 3D Molecular Property Prediction and Beyond
Authors: Jonathan Godwin, Michael Schaarschmidt, Alexander L Gaunt, Alvaro Sanchez-Gonzalez, Yulia Rubanova, Petar Veličković, James Kirkpatrick, Peter Battaglia
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our regulariser applies well-studied methods in simple, straightforward ways which allow even generic architectures to overcome oversmoothing and achieve state of the art results on quantum chemistry tasks, and improve results significantly on Open Graph Benchmark (OGB) datasets. Our results suggest Noisy Nodes can serve as a complementary building block in the GNN toolkit. |
| Researcher Affiliation | Industry | Jonathan Godwin, Michael Schaarschmidt, Alexander Gaunt, Alvaro Sanchez-Gonzalez, Yulia Rubanova, Petar Veličković, James Kirkpatrick & Peter Battaglia DeepMind, London {jonathangodwin}@deepmind.com |
| Pseudocode | Yes | A.9 PSEUDOCODE FOR 3D MOLECULAR PREDICTION TRAINING STEP Algorithm 1: Noisy Nodes Training Step |
| Open Source Code | Yes | Code for reproducing OGB-PCQM4M results using Noisy Nodes is available on github, and was prepared as part of a leaderboard submission. https://github.com/deepmind/deepmind-research/tree/master/ogb_lsc/pcq |
| Open Datasets | Yes | We tested this architecture on three challenging molecular property prediction benchmarks: OC20 (Chanussot* et al., 2020) IS2RS & IS2RE, and QM9 (Ramakrishnan et al., 2014). This dataset from the OGB benchmarks consists of molecular graphs made up of bonds and atom types, with no 3D or 2D coordinates. |
| Dataset Splits | Yes | We use 114k molecules for training, 10k for validation and 10k for test. Four canonical validation datasets are also provided. |
| Hardware Specification | Yes | All training experiments were run on a cluster of TPU devices. For the Open Catalyst experiments, each individual run (i.e. a single random seed) utilised 8 TPU devices on 2 hosts (4 per host) for training, and 4 V100 GPU devices for evaluation (1 per dataset). QM9. Experiments were also run on TPU devices. Each seed was run using 8 TPU devices on a single host for training, and 2 V100 GPU devices for evaluation. |
| Software Dependencies | No | Our code base is implemented in JAX using Haiku and Jraph for GNNs, and Optax for training (Bradbury et al., 2018; Babuschkin et al., 2020; Godwin* et al., 2020; Hennigan et al., 2020). |
| Experiment Setup | Yes | A.11 HYPER-PARAMETERS Open Catalyst. We list the hyper-parameters used to train the default Open Catalyst experiment. If not specified otherwise (e.g. in ablations of these parameters), experiments were run with this configuration. (Tables 13, 14, 15, 16, 17 provide detailed parameter values such as optimiser, learning rates, batch sizes, MLP layers, hidden sizes, noise parameters, etc.) |
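The training step the paper gives pseudocode for (A.9) pairs the primary property-prediction loss with an auxiliary denoising loss on noise-corrupted node inputs. A minimal NumPy sketch of that objective is below; `model` and `denoise_head` are hypothetical stand-ins for the shared GNN trunk and the per-node denoising readout, and `sigma` and `aux_weight` are illustrative values, not the paper's hyper-parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_nodes_loss(positions, target, model, denoise_head,
                     sigma=0.3, aux_weight=0.1):
    """Sketch of a Noisy Nodes training objective.

    Corrupts input node positions with Gaussian noise, then combines
    the primary (graph-level) property loss with an auxiliary node-level
    loss that asks the network to recover the clean positions.
    """
    noise = sigma * rng.standard_normal(positions.shape)
    noisy = positions + noise
    latents = model(noisy)                        # shared GNN trunk (stand-in)
    pred = latents.mean()                         # toy graph-level readout
    primary = (pred - target) ** 2                # property-prediction loss
    denoised = denoise_head(latents)              # per-node position prediction
    aux = np.mean((denoised - positions) ** 2)    # denoising loss
    return primary + aux_weight * aux
```

In the paper the auxiliary target is per-node, which is what lets it counteract oversmoothing: every node keeps a distinct training signal even when the graph-level loss alone would not provide one.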