Strategies for Pre-training Graph Neural Networks
Authors: Weihua Hu*, Bowen Liu*, Joseph Gomes, Marinka Zitnik, Percy Liang, Vijay Pande, Jure Leskovec
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We systematically study pre-training on multiple graph classification datasets. We find that naïve strategies... improves generalization significantly across downstream tasks, leading up to 9.4% absolute improvements in ROC-AUC over non-pre-trained models and achieving state-of-the-art performance for molecular property prediction and protein function prediction. 5 EXPERIMENTS, 5.1 DATASETS, 5.3 RESULTS, Table 1: Test ROC-AUC (%) performance... |
| Researcher Affiliation | Academia | Weihua Hu1, Bowen Liu2, Joseph Gomes4, Marinka Zitnik5, Percy Liang1, Vijay Pande3, Jure Leskovec1; 1Department of Computer Science, 2Chemistry, 3Bioengineering, Stanford University; 4Department of Chemical and Biochemical Engineering, The University of Iowa; 5Department of Biomedical Informatics, Harvard University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Project website, data and code: http://snap.stanford.edu/gnn-pretrain |
| Open Datasets | Yes | We release the new datasets at: http://snap.stanford.edu/gnn-pretrain. For the chemistry domain, we use 2 million unlabeled molecules sampled from the ZINC15 database (Sterling & Irwin, 2015)... For graph-level multi-task supervised pre-training, we use a preprocessed ChEMBL dataset (Mayr et al., 2018; Gaulton et al., 2011)... as our downstream tasks, we decided to use 8 larger binary classification datasets contained in MoleculeNet (Wu et al., 2018)... |
| Dataset Splits | Yes | The split for train/validation/test sets is 80%:10%:10%. ... The effective split ratio for the train/validation/prior/test sets is 69% : 12% : 9.5% : 9.5%. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only general information about training time. |
| Software Dependencies | Yes | We use Pytorch (Paszke et al., 2017) and Pytorch Geometric (Fey & Lenssen, 2019) for all of our implementation. |
| Experiment Setup | Yes | We select the following hyper-parameters that performed well across all downstream tasks in the validation sets: 300 dimensional hidden units, 5 GNN layers (K = 5), and average pooling for the READOUT function. ... All models are trained with Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.001. ... For self-supervised pre-training, we use a batch size of 256, while for supervised pre-training, we use a batch size of 32 with dropout rate of 20%. ... We use a batch size of 32 and dropout rate of 50%. ... train models for 100 epochs, while on the protein function prediction dataset...we train models for 50 epochs. |
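
The Experiment Setup and Software Dependencies rows together pin down most of the downstream fine-tuning configuration (5 GIN layers, 300-dimensional hidden units, average pooling as READOUT, Adam with a learning rate of 0.001, batch size 32, 50% dropout). The sketch below is a minimal PyTorch Geometric reconstruction of that configuration, not the authors' released code (which is available at http://snap.stanford.edu/gnn-pretrain): the input feature dimension and task count are placeholders, and the paper's edge-feature-aware GIN variant and its pre-training stages are omitted.

```python
# Minimal sketch of the reported fine-tuning configuration:
# 5 GIN layers, 300-dim hidden units, mean pooling, Adam (lr=1e-3),
# dropout 0.5. NUM_NODE_FEATURES and NUM_TASKS are placeholders;
# edge-feature handling and the pre-training stages are not shown.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GINConv, global_mean_pool

NUM_NODE_FEATURES = 32   # placeholder; the paper uses categorical atom features
NUM_TASKS = 12           # placeholder; depends on the downstream dataset
HIDDEN_DIM, NUM_LAYERS, DROPOUT = 300, 5, 0.5


class GNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.convs = torch.nn.ModuleList()
        self.bns = torch.nn.ModuleList()
        in_dim = NUM_NODE_FEATURES
        for _ in range(NUM_LAYERS):
            mlp = torch.nn.Sequential(
                torch.nn.Linear(in_dim, HIDDEN_DIM),
                torch.nn.ReLU(),
                torch.nn.Linear(HIDDEN_DIM, HIDDEN_DIM),
            )
            self.convs.append(GINConv(mlp))
            self.bns.append(torch.nn.BatchNorm1d(HIDDEN_DIM))
            in_dim = HIDDEN_DIM
        # Graph-level prediction head for multi-task binary classification.
        self.head = torch.nn.Linear(HIDDEN_DIM, NUM_TASKS)

    def forward(self, x, edge_index, batch):
        for conv, bn in zip(self.convs, self.bns):
            x = F.relu(bn(conv(x, edge_index)))
            x = F.dropout(x, p=DROPOUT, training=self.training)
        x = global_mean_pool(x, batch)  # READOUT: average pooling
        return self.head(x)


model = GNN()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```

Under the quoted setup, a fine-tuning run would load pre-trained GNN weights before adding the prediction head, then iterate over a DataLoader with batch size 32 for 100 epochs on the molecular property prediction datasets (50 epochs for protein function prediction).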