Pre-training via Denoising for Molecular Property Prediction
Authors: Sheheryar Zaidi, Michael Schaarschmidt, James Martens, Hyunjik Kim, Yee Whye Teh, Alvaro Sanchez-Gonzalez, Peter Battaglia, Razvan Pascanu, Jonathan Godwin
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that using this pre-training objective significantly improves performance on multiple benchmarks, achieving a new state-of-the-art on the majority of targets in the widely used QM9 dataset. (A minimal sketch of the denoising objective appears after the table.) |
| Researcher Affiliation | Collaboration | University of Oxford, DeepMind |
| Pseudocode | No | The paper describes architectural details and processes, but does not include any figure, block, or section explicitly labeled "Pseudocode" or "Algorithm" with structured steps. |
| Open Source Code | Yes | Our code for experiments on TorchMD-NET is open source. GitHub repository: https://github.com/shehzaidi/pre-training-via-denoising. |
| Open Datasets | Yes | First, the main dataset we use for pre-training is PCQM4Mv2 (Nakata & Shimazaki, 2017) (license: CC BY 4.0)... Second, as a dataset for fine-tuning, we use QM9 (Ramakrishnan et al., 2014) (license: CC BY 4.0)... Third, Open Catalyst 2020 (OC20) (Chanussot et al., 2021) (license: CC Attribution 4.0)... Lastly, DES15K (Donchev et al., 2021) (license: CC0 1.0). |
| Dataset Splits | Yes | Following customary practice, hyperparameters, including the noise scale for denoising during pre-training and fine-tuning, are tuned on the HOMO target and then kept fixed for all other targets. ... Following prior work, we randomly split the dataset into 114k examples for training, 10k examples for validation and 10k examples for testing (for QM9). (See the split sketch after the table.) |
| Hardware Specification | Yes | GNS-TAT training for QM9, PCQM4Mv2 and DES15K was done on a cluster of 16 TPU v3 devices and evaluation on a single V100 device. GNS training for OC20 was done on 8 TPU v4 devices, with the exception of the 1.2 billion parameters variant of the model, which was trained on 64 TPU v4 devices. ... Models were trained on QM9 using data parallelism over two NVIDIA RTX 2080Ti GPUs. |
| Software Dependencies | No | The paper mentions software like JAX, Haiku, Jraph, and TorchMD-NET, but does not provide specific version numbers for these dependencies, which are necessary for reproducible software setup. |
| Experiment Setup | Yes | Detailed hyperparameter and hardware settings can be found in Appendices E and F. ... For GNS/GNS-TAT, we relied on the hyperparameters published by Godwin et al. (2022) but determined new noise values for pre-training and fine-tuning by tuning over the set of values {0.005, 0.01, 0.02, 0.05, 0.1} for each of PCQM4Mv2 and QM9 (on the HOMO energy target). ... Appendix F provides detailed tables of hyperparameters for GNS and GNS-TAT, including values for learning rate schedules, batch sizes, MLP layers, hidden sizes, noise parameters, etc. (A sketch of this noise-scale sweep follows the table.) |
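To make the Research Type row concrete, the following is a minimal sketch of the coordinate-denoising pre-training objective named in the title: Gaussian noise is added to equilibrium atomic coordinates and the network is trained to predict that noise. It assumes a generic GNN `model(z, pos)` with a per-atom 3D output head; `model`, `z`, `pos`, and `optimizer` are illustrative placeholders, not the authors' implementation.

```python
import torch

sigma = 0.05  # one value from the noise grid quoted in the Experiment Setup row

def denoising_step(model, z, pos, optimizer):
    """One pre-training step: perturb equilibrium coordinates with Gaussian
    noise and regress the noise that was added (denoising objective)."""
    noise = sigma * torch.randn_like(pos)   # per-atom Gaussian perturbation
    pred = model(z, pos + noise)            # per-atom 3-vector prediction
    loss = torch.mean((pred - noise) ** 2)  # noise-regression (denoising) loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```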
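The Dataset Splits row quotes a random 114k/10k/10k split of QM9. Below is a minimal sketch of such a split; the dataset size placeholder and the seed are assumptions, and the paper's exact splitting code may differ.

```python
import torch

num_molecules = 134_000  # placeholder; in practice, len(qm9_dataset)
generator = torch.Generator().manual_seed(0)  # assumed seed, not from the paper
perm = torch.randperm(num_molecules, generator=generator)

train_idx = perm[:114_000]         # 114k training examples
val_idx = perm[114_000:124_000]    # 10k validation examples
test_idx = perm[124_000:134_000]   # 10k test examples
```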
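Similarly, the Experiment Setup row describes tuning the noise scale over a small grid on the HOMO target and then keeping it fixed for all other targets. A hedged sketch of that selection loop follows; `pretrain_and_finetune` is a hypothetical stand-in for the real pre-training plus fine-tuning pipeline.

```python
import random

NOISE_GRID = [0.005, 0.01, 0.02, 0.05, 0.1]  # grid quoted from the paper

def pretrain_and_finetune(sigma: float, target: str = "homo") -> float:
    """Hypothetical stand-in: pre-train on PCQM4Mv2 with noise scale `sigma`,
    fine-tune on the given QM9 target, and return the validation MAE."""
    return random.random()  # dummy value so the sketch runs end to end

# Select the noise scale that minimises validation MAE on HOMO; that value is
# then kept fixed for every other QM9 target, as described in the paper.
best_sigma = min(NOISE_GRID, key=lambda s: pretrain_and_finetune(s, target="homo"))
```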