3D Infomax improves GNNs for Molecular Property Prediction
Authors: Hannes Stärk, Dominique Beaini, Gabriele Corso, Prudencio Tossou, Christian Dallago, Stephan Günnemann, Pietro Liò
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5. Experiments |
| Researcher Affiliation | Collaboration | (1) EECS, Massachusetts Institute of Technology, Cambridge, MA, USA; (2) Valence Discovery, Montreal, CA; (3) Department of Informatics, Technical University of Munich, DE; (4) Department of Computer Science and Technology, University of Cambridge, UK. |
| Pseudocode | No | The paper describes the model architecture and message-passing equations in text and with mathematical formulas (e.g., equations 5, 6, 7), but it does not present any structured pseudocode or algorithm blocks (an illustrative message-passing sketch follows the table). |
| Open Source Code | Yes | Code to 3D pre-train a GNN, to generate molecular fingerprint embeddings, or to reproduce results is available at https://github.com/HannesStark/3DInfomax. |
| Open Datasets | Yes | The concrete 3D datasets of which we use subsets for pre-training are: QM9 (Ramakrishnan et al., 2014), which contains 134k small molecules (18 atoms on average) with a single conformer; GEOM-Drugs (Axelrod & Gómez-Bombarelli, 2020) with 304k molecules; and QMugs (Isert et al., 2021) with 665k. |
| Dataset Splits | Yes | The molecular properties are relevant for quantum mechanics, physical chemistry, biophysics, and physiology, such that we can obtain a good estimate of how valuable our 3D pre-training is for each domain. For these properties, the interest is in how much our method can leverage this information and transfer it to molecules where no 3D geometry is available. Meanwhile, for biological or physiological properties such as blood-brain barrier penetration, it is not as clear whether improvements from 3D information are to be expected; this question needs to be answered alongside how much of the benefits 3D pre-training recovers. For this purpose, we use the following molecular graph datasets, which are mainly taken from MoleculeNet (Wu et al., 2017), and we use the scaffold splits with an 80/10/10 split ratio provided by OGB (Hu et al., 2020a) (a split-loading sketch follows the table). |
| Hardware Specification | Yes | The first machine has an AMD Ryzen 1700 CPU @ 3.70GHz, 16GB of RAM, and an Nvidia GTX 1060 GPU with 6GB VRAM. The second system contains two Intel Xeon Gold 6248 CPUs @ 2.50GHz, each with 20/40 cores, 400GB of RAM, and four Quadro RTX 8000 GPUs with 46GB VRAM, of which only a single one was used for each experiment. |
| Software Dependencies | No | All experiments were implemented in PyTorch (Paszke et al., 2017) using the graph-processing deep learning libraries PyTorch Geometric (Fey & Lenssen, 2019) and Deep Graph Library (Wang et al., 2019). |
| Experiment Setup | Yes | Pre-training: We use Adam with a start learning rate of 8×10⁻⁵ and a batch size of 500. The learning rate schedule during pre-training starts with 700 optimization steps of linear warmup, followed by the schedule given by PyTorch's ReduceLROnPlateau scheduler with reduction parameter 0.6, patience 25, and a cooldown of 20. Fine-tuning quantum mechanical properties: We use Adam with a start learning rate of 7×10⁻⁵, weight decay 1×10⁻¹¹, and a batch size of 128. For the learning rate schedule, we first perform warmup as follows... ReduceLROnPlateau with reduction parameter 0.5, patience 25, and a cooldown of 20 (a schedule sketch follows the table). |
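Since the paper expresses its message passing only as equations, the block below is a rough, generic sketch of a message-passing layer in PyTorch. It is illustrative only: the layer name, MLP shapes, and sum aggregation are assumptions and do not reproduce the paper's exact equations (5)-(7) or its specific architecture.

```python
import torch
import torch.nn as nn

class SimpleMessagePassingLayer(nn.Module):
    """Generic message-passing layer (illustrative sketch only;
    not the paper's exact update equations)."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        # message MLP over concatenated (source, target, edge) features
        self.message_mlp = nn.Sequential(
            nn.Linear(3 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # update MLP combines a node's state with its aggregated messages
        self.update_mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, h, edge_index, e):
        # h: (num_nodes, hidden_dim) node features
        # edge_index: (2, num_edges) source/target node indices
        # e: (num_edges, hidden_dim) edge features
        src, dst = edge_index
        msgs = self.message_mlp(torch.cat([h[src], h[dst], e], dim=-1))
        # sum-aggregate incoming messages per target node
        agg = torch.zeros_like(h).index_add_(0, dst, msgs)
        return self.update_mlp(torch.cat([h, agg], dim=-1))

layer = SimpleMessagePassingLayer(hidden_dim=64)
h = torch.randn(5, 64)                              # 5 nodes
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])   # 3 directed edges
e = torch.randn(3, 64)                              # per-edge features
h_new = layer(h, edge_index, e)                     # (5, 64)
```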
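The scaffold splits with an 80/10/10 ratio come from OGB. A minimal sketch of how such splits are typically retrieved with the `ogb` package follows; the dataset name `ogbg-molbbbp` (blood-brain barrier penetration) is an assumption chosen for illustration, not a statement of which OGB datasets the paper used.

```python
from ogb.graphproppred import PygGraphPropPredDataset

# Load an OGB molecular property prediction dataset; 'ogbg-molbbbp'
# is an example name assumed here for illustration.
dataset = PygGraphPropPredDataset(name="ogbg-molbbbp")

# OGB ships the scaffold split indices with the dataset (~80/10/10).
split_idx = dataset.get_idx_split()
train_set = dataset[split_idx["train"]]
valid_set = dataset[split_idx["valid"]]
test_set = dataset[split_idx["test"]]

print(len(train_set), len(valid_set), len(test_set))
```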
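The pre-training recipe quoted in the table (Adam at 8×10⁻⁵, batch size 500, 700 linear warmup steps, then ReduceLROnPlateau with factor 0.6, patience 25, cooldown 20) maps onto standard PyTorch as sketched below. The model, data, and loss are placeholders, and stepping the plateau scheduler on the training loss is an assumption; in practice it is usually stepped on a validation metric.

```python
import torch
import torch.nn.functional as F

# Placeholder model and data; the real pre-training model and loss differ.
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=8e-5)

# Plateau scheduler with the pre-training parameters quoted in the table.
plateau = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.6, patience=25, cooldown=20
)

base_lr, warmup_steps = 8e-5, 700
x, y = torch.randn(500, 16), torch.randn(500, 1)  # dummy batch of size 500

for step in range(1, 1001):
    if step <= warmup_steps:
        # 700 optimization steps of linear warmup up to the start LR
        for group in optimizer.param_groups:
            group["lr"] = base_lr * step / warmup_steps
    optimizer.zero_grad()
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    if step > warmup_steps:
        # after warmup, reduce the LR when the monitored loss plateaus
        plateau.step(loss.item())
```

Fine-tuning on the quantum mechanical properties would follow the same pattern, with `lr=7e-5` and `weight_decay=1e-11` passed to Adam, a batch size of 128, and a reduction factor of 0.5.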