An Image-enhanced Molecular Graph Representation Learning Framework

Authors: Hongxin Xiang, Shuting Jin, Jun Xia, Man Zhou, Jianmin Wang, Li Zeng, Xiangxiang Zeng

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In particular, GraphMVP and Mole-BERT equipped with IEM achieve new state-of-the-art performance on the MoleculeNet benchmark, achieving an average ROC-AUC of 73.89% and 73.81%, respectively.
Researcher Affiliation | Collaboration | Hongxin Xiang (1,2), Shuting Jin (3), Jun Xia (4), Man Zhou (5), Jianmin Wang (6), Li Zeng (2), Xiangxiang Zeng (1); 1 College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; 2 Department of AIDD, Shanghai Yuyao Biotechnology Co., Ltd., Shanghai, China; 3 School of Computer Science & Technology, Wuhan University of Science and Technology, Wuhan, China; 4 School of Engineering, Westlake University, Hangzhou, China; 5 University of Science and Technology of China, Hefei, China; 6 The Interdisciplinary Graduate Program in Integrative Biotechnology, Yonsei University, Incheon, Korea
Pseudocode | No | The paper describes the method using figures and equations, but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/HongxinXiang/IEM.
Open Datasets | Yes | For pre-training the teacher model, we sample 2 million unlabeled molecules with 3D conformations from the PCQM4Mv2 database [Hu et al., 2017]. In the evaluation stage, we use the widely used 8 binary classification datasets from MoleculeNet [Wu et al., 2018] with the ROC-AUC metric.
Dataset Splits | Yes | Notably, we use strict scaffold splitting [Hu et al., 2020a] to divide all datasets into training, validation and test sets in an 8:1:1 ratio (see the scaffold-split sketch after the table).
Hardware Specification | No | The paper does not specify the hardware used for experiments, such as specific GPU or CPU models.
Software Dependencies | No | The paper mentions architectural details like ResNet-18, but does not list specific software libraries or their version numbers used in the implementation.
Experiment Setup | Yes | The teacher model is pre-trained for more than 30 epochs (about 450k steps) with a temperature of 0.1, batch size of 128 and learning rate of 0.01... we train for 100 epochs with a batch size of 32 and learning rate of 0.001. We select hyper-parameters λ_KE and λ_TE from {0.001, 0.01, 0.1, 1, 5} and report test scores corresponding to the best validation performance (see the configuration sketch after the table).
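
The "strict scaffold splitting" reported above follows the standard Bemis-Murcko scaffold protocol of Hu et al., 2020a. The following is a minimal sketch of such an 8:1:1 split, assuming RDKit is available; it is an illustration of the technique, not the exact splitting code from the IEM repository.

```python
# Minimal scaffold-split sketch (assumption: RDKit Bemis-Murcko scaffolds,
# greedy assignment of largest scaffold groups first, as in Hu et al., 2020a).
from collections import defaultdict
from rdkit.Chem.Scaffolds import MurckoScaffold


def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Split indices of `smiles_list` into train/valid/test by Murcko scaffold."""
    scaffold_to_idx = defaultdict(list)
    for i, smi in enumerate(smiles_list):
        scaffold = MurckoScaffold.MurckoScaffoldSmiles(smiles=smi, includeChirality=False)
        scaffold_to_idx[scaffold].append(i)

    # Largest scaffold groups are assigned first, so molecules sharing a scaffold
    # never cross the train/valid/test boundary ("strict" scaffold splitting).
    groups = sorted(scaffold_to_idx.values(), key=len, reverse=True)

    n = len(smiles_list)
    train, valid, test = [], [], []
    for group in groups:
        if len(train) + len(group) <= frac_train * n:
            train.extend(group)
        elif len(valid) + len(group) <= frac_valid * n:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test
```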
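The experiment-setup row also describes a small grid search over the loss weights λ_KE and λ_TE with model selection on validation ROC-AUC. The sketch below restates those reported settings as a configuration plus a selection loop; `run_finetune` is a hypothetical stand-in for the repository's training entry point, not its actual API.

```python
# Reported hyper-parameters from the paper, expressed as plain config dicts,
# and an illustrative grid search over the distillation weights.
from itertools import product

# Teacher pre-training settings as reported (>30 epochs, ~450k steps).
PRETRAIN_CONFIG = {"epochs": 30, "steps": 450_000, "temperature": 0.1, "batch_size": 128, "lr": 0.01}

# Downstream fine-tuning settings as reported.
FINETUNE_CONFIG = {"epochs": 100, "batch_size": 32, "lr": 1e-3, "split": "scaffold", "metric": "roc_auc"}

LAMBDA_GRID = [0.001, 0.01, 0.1, 1, 5]


def select_hyperparameters(run_finetune):
    """Pick (lambda_KE, lambda_TE) by best validation ROC-AUC; report its test score."""
    best = None
    for lam_ke, lam_te in product(LAMBDA_GRID, LAMBDA_GRID):
        # run_finetune is assumed to return (validation ROC-AUC, test ROC-AUC).
        valid_auc, test_auc = run_finetune(lambda_ke=lam_ke, lambda_te=lam_te, **FINETUNE_CONFIG)
        if best is None or valid_auc > best[0]:
            best = (valid_auc, test_auc, lam_ke, lam_te)
    return best
```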