Self-Supervised Graph Transformer on Large-Scale Molecular Data
Authors: Yu Rong, Yatao Bian, Tingyang Xu, Weiyang Xie, Ying Wei, Wenbing Huang, Junzhou Huang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We pre-train GROVER with 100 million parameters on 10 million unlabelled molecules, the biggest GNN and the largest training dataset in molecular representation learning. We then leverage the pre-trained GROVER for molecular property prediction followed by task-specific fine-tuning, where we observe a huge improvement (more than 6% on average) from current state-of-the-art methods on 11 challenging benchmarks. |
| Researcher Affiliation | Collaboration | 1 Tencent AI Lab; 2 Beijing National Research Center for Information Science and Technology (BNRist), Department of Computer Science and Technology, Tsinghua University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. It does not contain a specific repository link, explicit code release statement, or mention of code in supplementary materials. |
| Open Datasets | Yes | We collect 11 million (M) unlabelled molecules sampled from ZINC15 [48] and Chembl [11] datasets to pre-train GROVER... All datasets can be downloaded from http://moleculenet.ai/datasets-1 |
| Dataset Splits | Yes | We randomly split 10% of unlabelled molecules as the validation sets for model selection... We adopt the scaffold splitting method with a ratio for train/validation/test as 8:1:1. (A scaffold-split sketch follows the table.) |
| Hardware Specification | Yes | We use 250 Nvidia V100 GPUs to pre-train GROVERbase and GROVERlarge. |
| Software Dependencies | No | The paper mentions software like Adam optimizer, Noam learning rate scheduler, and RDKit, but does not provide specific version numbers for these or other key software components. |
| Experiment Setup | Yes | We use Adam optimizer for both pre-training and fine-tuning. The Noam learning rate scheduler [9] is adopted to adjust the learning rate during training... For the contextual property prediction task, we set the context radius k = 1... For each molecular graph, we randomly mask 15% of node and edge labels for prediction... For each training process, we train models for 100 epochs. For hyper-parameters, we perform the random search on the validation set for each dataset and report the best results. (Sketches of the Noam schedule and the 15% label masking follow the table.) |
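
The Dataset Splits row reports a scaffold split with a train/validation/test ratio of 8:1:1. Since no code is released for the paper, the following is only a minimal sketch of a Bemis-Murcko scaffold split built on RDKit; the function name `scaffold_split`, its defaults, and the largest-group-first fill order are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical scaffold split (8:1:1), assuming RDKit is installed.
from collections import defaultdict
from rdkit.Chem.Scaffolds.MurckoScaffold import MurckoScaffoldSmiles


def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Group molecules by Bemis-Murcko scaffold and fill train/valid/test
    buckets with whole scaffold groups so no scaffold crosses splits."""
    scaffold_to_idx = defaultdict(list)
    for i, smi in enumerate(smiles_list):
        scaffold = MurckoScaffoldSmiles(smiles=smi, includeChirality=True)
        scaffold_to_idx[scaffold].append(i)

    # Larger scaffold groups go first so the train set absorbs common scaffolds.
    groups = sorted(scaffold_to_idx.values(), key=len, reverse=True)

    n = len(smiles_list)
    train_cut, valid_cut = frac_train * n, (frac_train + frac_valid) * n
    train, valid, test = [], [], []
    for group in groups:
        if len(train) + len(group) <= train_cut:
            train.extend(group)
        elif len(train) + len(valid) + len(group) <= valid_cut:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test


train_idx, valid_idx, test_idx = scaffold_split(["CCO", "c1ccccc1O", "CC(=O)O"])
```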
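The Experiment Setup row mentions the Noam learning rate scheduler [9] (linear warm-up followed by decay proportional to step^-0.5). Below is a minimal PyTorch sketch of that schedule; the `d_model` and `warmup` values are placeholders, not hyper-parameters reported in the paper.

```python
# Noam schedule sketch: lr = d_model**-0.5 * min(step**-0.5, step * warmup**-1.5)
import torch


def noam_lambda(step, d_model=128, warmup=10000):
    step = max(step, 1)  # LambdaLR evaluates step 0 first; avoid 0 ** -0.5
    return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)


model = torch.nn.Linear(16, 1)                            # stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=1.0)  # base lr is scaled by noam_lambda
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_lambda)

for step in range(100):
    optimizer.step()   # the actual training step would go here
    scheduler.step()   # warm up linearly, then decay as step ** -0.5
```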
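The same row states that 15% of node and edge labels are randomly masked for the contextual property prediction task. A minimal sketch of such a random label mask is shown below, assuming PyTorch; the helper name `random_label_mask` is hypothetical and does not reproduce GROVER's unreleased masking code.

```python
import torch


def random_label_mask(num_items, mask_ratio=0.15):
    """Boolean mask selecting 15% of nodes (or edges), chosen uniformly at
    random, whose labels the model must predict from the graph context."""
    mask = torch.zeros(num_items, dtype=torch.bool)
    num_masked = max(1, int(mask_ratio * num_items))
    mask[torch.randperm(num_items)[:num_masked]] = True
    return mask


# Example: mask labels for a molecular graph with 30 atoms and 64 bonds.
atom_mask = random_label_mask(30)
bond_mask = random_label_mask(64)
```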