Molecular Representation Learning via Heterogeneous Motif Graph Neural Networks
Authors: Zhaoning Yu, Hongyang Gao
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that our model consistently outperforms previous state-of-the-art models. |
| Researcher Affiliation | Academia | Department of Computer Science, Iowa State University, Ames, United States of America. Correspondence to: Zhaoning Yu <znyu@iastate.edu>, Hongyang Gao <hygao@iastate.edu>. |
| Pseudocode | Yes | Appendix A contains detailed pseudocode for constructing a Heterogeneous Motif Graph: Algorithm 1 shows how to construct the Heterogeneous Motif Graph, and Algorithm 2 gives the minibatch HM-GNN procedure. (A hedged construction sketch follows the table.) |
| Open Source Code | Yes | The code used to train and evaluate the models is available online at https://github.com/ZhaoningYu1996/HM-GNN. |
| Open Datasets | Yes | We compare our methods with previous state-of-the-art models on various benchmark datasets. Dataset details and experiment settings are provided in Appendices C and D, which give detailed descriptions of the datasets used in the experiments. Further details can be found in (Yanardag & Vishwanathan, 2015), (Zhang et al., 2019), and (Wu et al., 2018). |
| Dataset Splits | Yes | For each dataset, we perform 10-fold cross-validation with random splitting on the entire dataset and report the mean and standard deviation of validation accuracy over the ten folds. Following (Wu et al., 2018; Hu et al., 2020), we adopt the scaffold splitting procedure to split the dataset. We evaluate three settings: 90%/10%, 50%/50%, and 10%/90% train/test splits. (A hedged splitting sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for its experiments, such as exact GPU or CPU models, memory, or accelerator types. |
| Software Dependencies | No | The paper mentions software such as RDKit and the Adam optimizer but does not provide version numbers for any software dependencies required for replication. |
| Experiment Setup | Yes | For all configurations, 3 GNN layers are applied, and all MLPs have 2 layers. Batch normalization is applied to each layer, and dropout is applied to all layers except the first. The batch size is set to 2000. We use the Adam optimizer with initial weight decay 0.0005. The hyper-parameters tuned for each dataset are: (1) learning rate {0.01, 0.05}; (2) number of hidden units {16, 64, 1024}; (3) dropout ratio {0.2, 0.5}. (A hedged configuration sketch follows the table.) |
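
The Pseudocode row refers to Algorithm 1 of the paper, which builds a Heterogeneous Motif Graph from a set of molecules. The sketch below is not the authors' algorithm; it only illustrates the general idea under the assumption that motifs are rings and non-ring bonds extracted with RDKit, with molecule-motif edges for containment and motif-motif edges for co-occurrence. All function names here are hypothetical.

```python
# Hypothetical sketch of heterogeneous motif graph construction (cf. Algorithm 1).
# Assumes motifs are rings and non-ring bonds extracted with RDKit; the edge rules
# (molecule-motif containment, motif-motif co-occurrence) follow the paper's
# description only at a high level and are not the authors' exact procedure.
from collections import defaultdict
from itertools import combinations

from rdkit import Chem


def extract_motifs(smiles: str) -> set[str]:
    """Return motif keys (ring SMILES and non-ring bond labels) for one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    motifs = set()
    ring_atoms = set()
    for ring in mol.GetRingInfo().AtomRings():
        ring_atoms.update(ring)
        # The SMILES of the ring fragment serves as the motif key.
        motifs.add(Chem.MolFragmentToSmiles(mol, atomsToUse=list(ring)))
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        if i in ring_atoms and j in ring_atoms:
            continue  # bonds fully inside rings belong to ring motifs
        a, b = sorted([bond.GetBeginAtom().GetSymbol(), bond.GetEndAtom().GetSymbol()])
        motifs.add(f"{a}{bond.GetBondTypeAsDouble()}{b}")
    return motifs


def build_heterogeneous_motif_graph(smiles_list: list[str]):
    """Build node and edge lists for a graph with molecule nodes and motif nodes."""
    motif_index: dict[str, int] = {}
    mol_motif_edges = []                  # (molecule_id, motif_id) containment edges
    motif_occurrences = defaultdict(set)  # motif_id -> molecules containing it
    for mol_id, smi in enumerate(smiles_list):
        for key in extract_motifs(smi):
            motif_id = motif_index.setdefault(key, len(motif_index))
            mol_motif_edges.append((mol_id, motif_id))
            motif_occurrences[motif_id].add(mol_id)
    # Motif-motif edge when two motifs co-occur in at least one molecule.
    motif_motif_edges = [
        (m1, m2)
        for m1, m2 in combinations(sorted(motif_occurrences), 2)
        if motif_occurrences[m1] & motif_occurrences[m2]
    ]
    return motif_index, mol_motif_edges, motif_motif_edges


if __name__ == "__main__":
    index, mm, tt = build_heterogeneous_motif_graph(["CCO", "c1ccccc1O", "CC(=O)O"])
    print(len(index), "motifs;", len(mm), "molecule-motif edges;", len(tt), "motif-motif edges")
```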
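The Dataset Splits row mentions two protocols: 10-fold cross-validation with random splitting, and scaffold splitting following Wu et al. (2018) and Hu et al. (2020). Below is a generic sketch of both, assuming scikit-learn's `KFold` and RDKit's Bemis-Murcko scaffolds; the exact MoleculeNet/OGB scaffold-split implementation used by the authors may differ, and the split fractions here are illustrative.

```python
# Hedged sketch of the two splitting protocols: 10-fold random cross-validation
# and a Bemis-Murcko scaffold split. Fractions and tie-breaking are illustrative,
# not the authors' exact recipe.
from collections import defaultdict

import numpy as np
from rdkit.Chem.Scaffolds import MurckoScaffold
from sklearn.model_selection import KFold


def random_cv_folds(n_samples: int, n_folds: int = 10, seed: int = 0):
    """Yield (train_idx, valid_idx) index pairs for k-fold CV with random shuffling."""
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    yield from kf.split(np.arange(n_samples))


def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Group molecules by Murcko scaffold and fill train/valid/test with whole groups,
    largest groups first, so structurally similar molecules stay in the same split."""
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        scaffold = MurckoScaffold.MurckoScaffoldSmiles(smiles=smi, includeChirality=False)
        groups[scaffold].append(idx)
    train, valid, test = [], [], []
    n = len(smiles_list)
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(group) <= frac_train * n:
            train.extend(group)
        elif len(valid) + len(group) <= frac_valid * n:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test
```

Assigning whole scaffold groups to a single split keeps structurally similar molecules out of the test set, which is the point of scaffold splitting; the 90%/50%/10% training-fraction settings quoted above would correspond to adjusting the split fractions.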
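The Experiment Setup row quotes the reported configuration: 3 GNN layers, 2-layer MLPs, batch normalization on every layer, dropout on all layers except the first, batch size 2000, Adam with weight decay 0.0005, and a small hyper-parameter grid. The sketch below wires those settings into a placeholder PyTorch model; the message-passing operator is a stand-in and not the paper's HM-GNN layer.

```python
# Minimal sketch of the reported training configuration. The aggregation step is a
# placeholder (dense adjacency sum), not the authors' HM-GNN operator.
from itertools import product

import torch
from torch import nn


class MLPLayer(nn.Module):
    def __init__(self, in_dim, hidden, dropout, use_dropout=True):
        super().__init__()
        # 2-layer MLP per GNN layer, as stated in the table.
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.bn = nn.BatchNorm1d(hidden)          # batch norm on every layer
        self.drop = nn.Dropout(dropout) if use_dropout else nn.Identity()

    def forward(self, adj, x):
        # Placeholder aggregation: self features plus summed neighbor features.
        h = self.mlp(x + adj @ x)
        return self.drop(torch.relu(self.bn(h)))


class SketchGNN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes, dropout):
        super().__init__()
        dims = [in_dim, hidden, hidden]            # 3 GNN layers
        self.layers = nn.ModuleList(
            [MLPLayer(d, hidden, dropout, use_dropout=(i > 0))  # no dropout on the first layer
             for i, d in enumerate(dims)]
        )
        self.readout = nn.Linear(hidden, n_classes)

    def forward(self, adj, x):
        for layer in self.layers:
            x = layer(adj, x)
        return self.readout(x)


# Hyper-parameter grid quoted in the table.
GRID = {"lr": [0.01, 0.05], "hidden": [16, 64, 1024], "dropout": [0.2, 0.5]}

for lr, hidden, dropout in product(GRID["lr"], GRID["hidden"], GRID["dropout"]):
    model = SketchGNN(in_dim=32, hidden=hidden, n_classes=2, dropout=dropout)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=0.0005)
    # ... train with batch size 2000 and pick the best setting on the validation folds.
```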