Pre-training via Denoising for Molecular Property Prediction

Authors: Sheheryar Zaidi, Michael Schaarschmidt, James Martens, Hyunjik Kim, Yee Whye Teh, Alvaro Sanchez-Gonzalez, Peter Battaglia, Razvan Pascanu, Jonathan Godwin

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments demonstrate that using this pre-training objective significantly improves performance on multiple benchmarks, achieving a new state-of-the-art on the majority of targets in the widely used QM9 dataset.
Researcher Affiliation | Collaboration | University of Oxford, DeepMind
Pseudocode | No | The paper describes architectural details and processes, but does not include any figure, block, or section explicitly labeled "Pseudocode" or "Algorithm" with structured steps.
Open Source Code | Yes | Our code for experiments on TorchMD-NET is open source. GitHub repository: https://github.com/shehzaidi/pre-training-via-denoising.
Open Datasets | Yes | First, the main dataset we use for pre-training is PCQM4Mv2 (Nakata & Shimazaki, 2017) (license: CC BY 4.0)... Second, as a dataset for fine-tuning, we use QM9 (Ramakrishnan et al., 2014) (license: CC BY 4.0)... Third, Open Catalyst 2020 (OC20) (Chanussot et al., 2021) (license: CC Attribution 4.0)... Lastly, DES15K (Donchev et al., 2021) (license: CC0 1.0).
Dataset Splits | Yes | Following customary practice, hyperparameters, including the noise scale for denoising during pre-training and fine-tuning, are tuned on the HOMO target and then kept fixed for all other targets. ... Following prior work, we randomly split the dataset into 114k examples for training, 10k examples for validation and 10k examples for testing (for QM9). (See the split sketch after the table.)
Hardware Specification | Yes | GNS-TAT training for QM9, PCQM4Mv2 and DES15K was done on a cluster of 16 TPU v3 devices and evaluation on a single V100 device. GNS training for OC20 was done on 8 TPU v4 devices, with the exception of the 1.2 billion parameter variant of the model, which was trained on 64 TPU v4 devices. ... Models were trained on QM9 using data parallelism over two NVIDIA RTX 2080Ti GPUs.
Software Dependencies | No | The paper mentions software such as JAX, Haiku, Jraph, and TorchMD-NET, but does not provide specific version numbers for these dependencies, which are necessary for a reproducible software setup. (See the version-capture sketch after the table.)
Experiment Setup | Yes | Detailed hyperparameter and hardware settings can be found in Appendices E and F. ... For GNS/GNS-TAT, we relied on the hyperparameters published by Godwin et al. (2022) but determined new noise values for pre-training and fine-tuning by tuning over the set of values {0.005, 0.01, 0.02, 0.05, 0.1} for each of PCQM4Mv2 and QM9 (on the HOMO energy target). ... Appendix F provides detailed tables of hyperparameters for GNS and GNS-TAT, including values for learning rate schedules, batch sizes, MLP layers, hidden sizes, noise parameters, etc. (A sketch of the denoising objective and its noise scale follows the table.)
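
The Experiment Setup row lists the noise scales {0.005, 0.01, 0.02, 0.05, 0.1} tuned for the denoising objective. The sketch below illustrates the general shape of a coordinate-denoising pre-training step, assuming a model with a per-atom 3D vector output head (as in TorchMD-NET); it is an illustration of the technique under those assumptions, not the authors' implementation.

```python
import torch

# Noise scales searched in the paper (Experiment Setup row above);
# sigma would be chosen from this set during hyperparameter tuning.
NOISE_SCALES = (0.005, 0.01, 0.02, 0.05, 0.1)

def denoising_step(model, atom_types, positions, sigma):
    """One pre-training step: perturb equilibrium coordinates with Gaussian
    noise of scale `sigma` and regress the per-atom noise vectors.
    `model(atom_types, noisy_positions)` returning an (N, 3) tensor is an
    assumed interface, not the paper's exact API."""
    noise = sigma * torch.randn_like(positions)    # eps ~ N(0, sigma^2 I)
    noisy_positions = positions + noise
    pred = model(atom_types, noisy_positions)      # per-atom 3-vector prediction
    return ((pred - noise) ** 2).mean()            # MSE against the sampled noise
```

The Dataset Splits row quotes a noise scale being tuned for fine-tuning as well as pre-training, which suggests a denoising term of this form is also kept during fine-tuning; the exact weighting is not spelled out in the excerpts above.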
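The Dataset Splits row quotes a random 114k/10k/10k train/validation/test split of QM9. Below is a minimal sketch of such a split; the seed and the total molecule count passed in (133,885 for the full QM9 set) are assumptions about usage, not values prescribed by the paper.

```python
import numpy as np

def random_split(num_examples, n_val=10_000, n_test=10_000, seed=0):
    """Random train/val/test split of dataset indices.

    With the full QM9 set (133,885 molecules) this yields ~114k training
    examples and 10k each for validation and testing, matching the split
    quoted above. The seed is an assumption made for reproducibility.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_examples)
    n_train = num_examples - n_val - n_test
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

train_idx, val_idx, test_idx = random_split(133_885)
```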
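The Software Dependencies row notes that library versions are not reported. One hedged way to record them when re-running the code is to query the installed distributions directly; the distribution names below (e.g. dm-haiku for Haiku) are assumptions about how the packages are installed.

```python
from importlib import metadata

# Distribution names are assumptions; adjust to the actual environment.
for dist in ("jax", "dm-haiku", "jraph", "torch"):
    try:
        print(f"{dist}=={metadata.version(dist)}")
    except metadata.PackageNotFoundError:
        print(f"{dist}: not installed")
```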