Motif-based Graph Self-Supervised Learning for Molecular Property Prediction

Authors: Zaixi Zhang, Qi Liu, Hao Wang, Chengqiang Lu, Chee-Kong Lee

NeurIPS 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on various downstream benchmark tasks show that our methods outperform all state-of-the-art baselines.
Researcher Affiliation	Collaboration	Zaixi Zhang¹, Qi Liu¹, Hao Wang¹, Chengqiang Lu¹, Chee-Kong Lee². ¹ Anhui Province Key Lab of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China; ² Tencent America
Pseudocode	Yes	The pseudo codes of the training process is included in the Appendix.
Open Source Code	Yes	The implementation is publicly available at https://github.com/zaixizhang/MGSSL.
Open Datasets	Yes	we use 250k unlabeled molecules sampled from the ZINC15 database [38] for self-supervised pre-training tasks. As for the downstream finetune tasks, we consider 8 binary classification benchmark datasets contained in MoleculeNet [45].
Dataset Splits	Yes	The split for train/validation/test sets is 80% : 10% : 10%.
Hardware Specification	Yes	All experiments are conducted on Tesla V100 GPUs.
Software Dependencies	No	The paper mentions using "the open-source package RDKit [22]" but does not specify a version number for it or any other software dependencies.
Experiment Setup	Yes	In the process of pre-training, GNNs are pre-trained for 100 epochs with Adam optimizer and learning rate 0.001. In the finetuning stage, we train for 100 epochs and report the testing score with the best cross-validation performance. The hidden dimension is set to 300 and the batch size is set to 32 for pre-training and finetuning.
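
The settings quoted in the Dataset Splits and Experiment Setup rows can be collected into a small configuration sketch. This is an illustration only, not code from the MGSSL repository: the names `ExperimentConfig` and `split_indices` are hypothetical, and the seeded random shuffle is an assumption, since the rows above give the 80% : 10% : 10% ratio but not the splitting strategy.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters quoted in the table above (class name is hypothetical)."""
    pretrain_epochs: int = 100
    finetune_epochs: int = 100
    pretrain_learning_rate: float = 1e-3  # Adam; stated for pre-training only
    hidden_dim: int = 300
    batch_size: int = 32  # used for both pre-training and finetuning
    split_fractions: tuple = (0.8, 0.1, 0.1)  # train / validation / test


def split_indices(n_items, fractions=(0.8, 0.1, 0.1), seed=0):
    """Cut shuffled indices 0..n_items-1 into train/validation/test lists.

    Assumption: a seeded random shuffle. The summary does not say how
    molecules are assigned to each split, so this is a stand-in.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n_items)
    n_valid = int(fractions[1] * n_items)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test
```

Calling `split_indices(len(dataset), ExperimentConfig().split_fractions)` yields three disjoint index lists that together cover the dataset. The finetuning learning rate is left out of the config because the quoted text does not state it.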