Learning Harmonic Molecular Representations on Riemannian Manifold
Authors: Yiqun Wang, Yuning Shen, Shi Chen, Lihao Wang, Fei Ye, Hao Zhou
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our proposed method shows comparable predictive power to current models in small molecule property prediction, and outperforms the state-of-the-art deep learning models for ligand-binding protein pocket classification and the rigid protein docking challenge, demonstrating its versatility in molecular representation learning." (Abstract). The paper also has a dedicated Section 5, "Experiments". |
| Researcher Affiliation | Collaboration | Yiqun Wang¹, Yuning Shen¹, Shi Chen², Lihao Wang¹, Fei Ye¹, Hao Zhou³. ¹ByteDance Research, ²University of Wisconsin-Madison, ³Institute for AI Industry Research (AIR), Tsinghua University |
| Pseudocode | No | The paper describes its methods through text and mathematical equations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and data are available at https://github.com/GeomMolDesign/HMR. |
| Open Datasets | Yes | The QM9 raw dataset is provided at https://springernature.figshare.com/ndownloader/files/3195389. The dataset for the ligand-binding pocket classification is provided at https://zenodo.org/record/2625420, and the split used by MaSIF is at https://github.com/LPDI-EPFL/masif/tree/master/data/masif_ligand/lists. The DIPS dataset can be downloaded from https://github.com/BioinfoMachineLearning/DIPS-Plus. |
| Dataset Splits | Yes | QM9: We follow the same data split as EGNN, leading to 130,650 instances (99,862 for training, 17,719 for validation, and 13,069 for test) (Appendix F). Ligand-binding: The same training, validation, and test split as Gainza et al. (2020) are used and we obtained 1,634 training pockets (in 986 protein complexes), 202 validation pockets (in 112 protein complexes), and 418 test pockets (in 274 protein complexes) (Appendix G). Rigid protein docking: After removing proteins failed in surface mesh generation, DIPS-Het contains 11,373 training cases and 508 validation cases (Appendix H.1). |
| Hardware Specification | Yes | We trained our model on NVIDIA A100 GPUs with 80 GB memory with a batch size of 32, which on average takes 240 seconds to train a single epoch (99,862 molecules), and 16 seconds for inference on the test set (13,069 molecules). |
| Software Dependencies | No | The paper mentions software such as MSMS (Ewing & Hermisson, 2010), PyMesh (Zhou, 2019), libigl (Jacobson & Panozzo, 2017), and scipy (eigsh) for various steps, but it does not consistently give the version numbers needed for reproducibility. |
| Experiment Setup | Yes | Table H.7, "Hyperparameter choices of HMR and the training phase settings" (Appendix H.10), provides specific values for batch size, number of epochs, learning rate, optimizer, dropout rate, and other parameters used in training. |
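
The split sizes and timings quoted in the Dataset Splits and Hardware Specification rows can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch and is not taken from the HMR repository: the variable and function names are illustrative, and the throughput figures are derived here from the quoted averages rather than reported by the paper.

```python
# Split sizes quoted in the "Dataset Splits" row for QM9 (Appendix F).
qm9_split = {"train": 99_862, "validation": 17_719, "test": 13_069}
qm9_total_reported = 130_650

# Pocket counts quoted for ligand-binding pocket classification (Appendix G).
pocket_split = {"train": 1_634, "validation": 202, "test": 418}


def split_fractions(split):
    """Return each subset's share of the total number of instances."""
    total = sum(split.values())
    return {name: count / total for name, count in split.items()}


# The three QM9 subsets should add up to the reported instance count.
qm9_total = sum(qm9_split.values())
assert qm9_total == qm9_total_reported

# Throughput implied by the "Hardware Specification" row:
# about 240 s per training epoch and 16 s for test-set inference.
# These are derived estimates (assumption), not numbers stated in the paper.
train_molecules_per_second = qm9_split["train"] / 240
inference_molecules_per_second = qm9_split["test"] / 16

if __name__ == "__main__":
    for name, fraction in split_fractions(qm9_split).items():
        print(f"QM9 {name}: {fraction:.1%}")
    for name, fraction in split_fractions(pocket_split).items():
        print(f"Pockets {name}: {fraction:.1%}")
    print(f"Training: {train_molecules_per_second:.0f} molecules/s")
    print(f"Inference: {inference_molecules_per_second:.0f} molecules/s")
```

The check confirms that the three QM9 subsets sum to the reported 130,650 instances. The throughput values depend on the quoted per-epoch averages and should be read as rough estimates only.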