Molecular Representation Learning via Heterogeneous Motif Graph Neural Networks
Authors: Zhaoning Yu, Hongyang Gao
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results show that our model consistently outperforms previous state-of-the-art models. |
| Researcher Affiliation | Academia | Department of Computer Science, Iowa State University, Ames, United States of America. Correspondence to: Zhaoning Yu <znyu@iastate.edu>, Hongyang Gao <hygao@iastate.edu>. |
| Pseudocode | Yes | Appendix A contains detailed pseudocode for constructing a Heterogeneous Motif Graph: Algorithm 1 shows how to construct the Heterogeneous Motif Graph, and Algorithm 2 gives the minibatch HM-GNN procedure. (A hedged construction sketch follows the table.) |
| Open Source Code | Yes | The code used to train and evaluate the models is available online at https://github.com/ZhaoningYu1996/HM-GNN. |
| Open Datasets | Yes | We compare our methods with previous state-of-the-art models on various benchmark datasets. Dataset details and experiment settings are provided in Appendices C and D, which give detailed descriptions of the datasets used in the experiments. Further details can be found in (Yanardag & Vishwanathan, 2015), (Zhang et al., 2019), and (Wu et al., 2018). |
| Dataset Splits | Yes | For each dataset, we perform 10-fold cross-validation with random splitting on the entire dataset and report the mean and standard deviation of validation accuracy over the ten folds. Following (Wu et al., 2018; Hu et al., 2020), we adopt the scaffold splitting procedure to split the dataset. We evaluate three settings: 90%/10%, 50%/50%, and 10%/90% train/test splits. (A hedged splitting sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware used for its experiments, such as exact GPU or CPU models, memory, or accelerator types. |
| Software Dependencies | No | The paper mentions software such as RDKit and the Adam optimizer but does not provide version numbers for any software dependencies required for replication. |
| Experiment Setup | Yes | For all configurations, 3 GNN layers are applied, and all MLPs have 2 layers. Batch normalization is applied to each layer, and dropout is applied to all layers except the first. The batch size is set to 2000. We use the Adam optimizer with initial weight decay 0.0005. The hyper-parameters tuned for each dataset are: (1) learning rate {0.01, 0.05}; (2) number of hidden units {16, 64, 1024}; (3) dropout ratio {0.2, 0.5}. (A hedged configuration sketch follows the table.) |
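
The Pseudocode row refers to Algorithm 1 of the paper, which builds a Heterogeneous Motif Graph from a set of molecules. The sketch below is not the authors' algorithm; it only illustrates the general idea under the assumption that motifs are rings and non-ring bonds extracted with RDKit, with molecule-motif edges for containment and motif-motif edges for co-occurrence. All function names here are hypothetical.

```python
# Hypothetical sketch of heterogeneous motif graph construction (cf. Algorithm 1).
# Assumes motifs are rings and non-ring bonds extracted with RDKit; the edge rules
# (molecule-motif containment, motif-motif co-occurrence) follow the paper's
# description only at a high level and are not the authors' exact procedure.
from collections import defaultdict
from itertools import combinations

from rdkit import Chem


def extract_motifs(smiles: str) -> set[str]:
    """Return motif keys (ring SMILES and non-ring bond labels) for one molecule."""
    mol = Chem.MolFromSmiles(smiles)
    motifs = set()
    ring_atoms = set()
    for ring in mol.GetRingInfo().AtomRings():
        ring_atoms.update(ring)
        # The SMILES of the ring fragment serves as the motif key.
        motifs.add(Chem.MolFragmentToSmiles(mol, atomsToUse=list(ring)))
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        if i in ring_atoms and j in ring_atoms:
            continue  # bonds fully inside rings belong to ring motifs
        a, b = sorted([bond.GetBeginAtom().GetSymbol(), bond.GetEndAtom().GetSymbol()])
        motifs.add(f"{a}{bond.GetBondTypeAsDouble()}{b}")
    return motifs


def build_heterogeneous_motif_graph(smiles_list: list[str]):
    """Build node and edge lists for a graph with molecule nodes and motif nodes."""
    motif_index: dict[str, int] = {}
    mol_motif_edges = []                  # (molecule_id, motif_id) containment edges
    motif_occurrences = defaultdict(set)  # motif_id -> molecules containing it
    for mol_id, smi in enumerate(smiles_list):
        for key in extract_motifs(smi):
            motif_id = motif_index.setdefault(key, len(motif_index))
            mol_motif_edges.append((mol_id, motif_id))
            motif_occurrences[motif_id].add(mol_id)
    # Motif-motif edge when two motifs co-occur in at least one molecule.
    motif_motif_edges = [
        (m1, m2)
        for m1, m2 in combinations(sorted(motif_occurrences), 2)
        if motif_occurrences[m1] & motif_occurrences[m2]
    ]
    return motif_index, mol_motif_edges, motif_motif_edges


if __name__ == "__main__":
    index, mm, tt = build_heterogeneous_motif_graph(["CCO", "c1ccccc1O", "CC(=O)O"])
    print(len(index), "motifs;", len(mm), "molecule-motif edges;", len(tt), "motif-motif edges")
```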
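The Dataset Splits row mentions two protocols: 10-fold cross-validation with random splitting, and scaffold splitting following Wu et al. (2018) and Hu et al. (2020). Below is a generic sketch of both, assuming scikit-learn's `KFold` and RDKit's Bemis-Murcko scaffolds; the exact MoleculeNet/OGB scaffold-split implementation used by the authors may differ, and the split fractions here are illustrative.

```python
# Hedged sketch of the two splitting protocols: 10-fold random cross-validation
# and a Bemis-Murcko scaffold split. Fractions and tie-breaking are illustrative,
# not the authors' exact recipe.
from collections import defaultdict

import numpy as np
from rdkit.Chem.Scaffolds import MurckoScaffold
from sklearn.model_selection import KFold


def random_cv_folds(n_samples: int, n_folds: int = 10, seed: int = 0):
    """Yield (train_idx, valid_idx) index pairs for k-fold CV with random shuffling."""
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    yield from kf.split(np.arange(n_samples))


def scaffold_split(smiles_list, frac_train=0.8, frac_valid=0.1):
    """Group molecules by Murcko scaffold and fill train/valid/test with whole groups,
    largest groups first, so structurally similar molecules stay in the same split."""
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        scaffold = MurckoScaffold.MurckoScaffoldSmiles(smiles=smi, includeChirality=False)
        groups[scaffold].append(idx)
    train, valid, test = [], [], []
    n = len(smiles_list)
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(train) + len(group) <= frac_train * n:
            train.extend(group)
        elif len(valid) + len(group) <= frac_valid * n:
            valid.extend(group)
        else:
            test.extend(group)
    return train, valid, test
```

Assigning whole scaffold groups to a single split keeps structurally similar molecules out of the test set, which is the point of scaffold splitting; the 90%/50%/10% training-fraction settings quoted above would correspond to adjusting the split fractions.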
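The Experiment Setup row quotes the reported configuration: 3 GNN layers, 2-layer MLPs, batch normalization on every layer, dropout on all layers except the first, batch size 2000, Adam with weight decay 0.0005, and a small hyper-parameter grid. The sketch below wires those settings into a placeholder PyTorch model; the message-passing operator is a stand-in and not the paper's HM-GNN layer.

```python
# Minimal sketch of the reported training configuration. The aggregation step is a
# placeholder (dense adjacency sum), not the authors' HM-GNN operator.
from itertools import product

import torch
from torch import nn


class MLPLayer(nn.Module):
    def __init__(self, in_dim, hidden, dropout, use_dropout=True):
        super().__init__()
        # 2-layer MLP per GNN layer, as stated in the table.
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.bn = nn.BatchNorm1d(hidden)          # batch norm on every layer
        self.drop = nn.Dropout(dropout) if use_dropout else nn.Identity()

    def forward(self, adj, x):
        # Placeholder aggregation: self features plus summed neighbor features.
        h = self.mlp(x + adj @ x)
        return self.drop(torch.relu(self.bn(h)))


class SketchGNN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes, dropout):
        super().__init__()
        dims = [in_dim, hidden, hidden]            # 3 GNN layers
        self.layers = nn.ModuleList(
            [MLPLayer(d, hidden, dropout, use_dropout=(i > 0))  # no dropout on the first layer
             for i, d in enumerate(dims)]
        )
        self.readout = nn.Linear(hidden, n_classes)

    def forward(self, adj, x):
        for layer in self.layers:
            x = layer(adj, x)
        return self.readout(x)


# Hyper-parameter grid quoted in the table.
GRID = {"lr": [0.01, 0.05], "hidden": [16, 64, 1024], "dropout": [0.2, 0.5]}

for lr, hidden, dropout in product(GRID["lr"], GRID["hidden"], GRID["dropout"]):
    model = SketchGNN(in_dim=32, hidden=hidden, n_classes=2, dropout=dropout)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=0.0005)
    # ... train with batch size 2000 and pick the best setting on the validation folds.
```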