MolHF: A Hierarchical Normalizing Flow for Molecular Graph Generation
Authors: Yiheng Zhu, Zhenqiu Ouyang, Ben Liao, Jialu Wu, Yixuan Wu, Chang-Yu Hsieh, Tingjun Hou, Jian Wu
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that MolHF achieves state-of-the-art performance in random generation and property optimization, implying its high capacity to model data distribution. Furthermore, MolHF is the first flow-based model that can be applied to model larger molecules (polymer) with more than 100 heavy atoms. |
| Researcher Affiliation | Collaboration | (1) College of Computer Science and Technology, Zhejiang University; (2) Polytechnic Institute, Zhejiang University; (3) Tencent Quantum Laboratory, Tencent; (4) College of Pharmaceutical Sciences, Zhejiang University; (5) School of Public Health, Zhejiang University; (6) Second Affiliated Hospital School of Medicine, Zhejiang University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and models are available at https://github.com/violet-sto/MolHF. |
| Open Datasets | Yes | We evaluate our method on the ZINC250K dataset [Irwin et al., 2012] for a fair comparison. In addition, we also use the Polymer dataset [St. John et al., 2019] to verify that our method is capable of scaling to larger molecules. |
| Dataset Splits | Yes | Note that two different sets of molecules selected from the test set [Jin et al., 2018] or the entire set [Shi et al., 2020] are used in prior methods. We thus report results separately. |
| Hardware Specification | Yes | The efficiency is compared on the same computing facilities using 1 Tesla V100 GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for ancillary software dependencies. |
| Experiment Setup | Yes | For the generation and reconstruction, the latent variables are assumed to follow a prior isotropic Gaussian distribution N(0, σ²I), where σ is a learnable parameter denoting the standard deviation. For property optimization and hierarchical visualization, the same model trained on the ZINC250K dataset is used for all experiments. Further implementation details can be found in Appendix C. |
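
The isotropic Gaussian prior N(0, σ²I) quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `isotropic_gaussian_log_prob` and `sample_prior` are hypothetical, and storing the learnable standard deviation as `log_sigma` (so that σ stays positive) is an assumption made here for illustration. In a trained flow, σ would be updated by gradient descent; in this sketch it is a plain float.

```python
import math
import random


def isotropic_gaussian_log_prob(z, log_sigma):
    """Log-density of the vector z under N(0, sigma^2 I).

    Assumption: sigma is stored as log_sigma, i.e. sigma = exp(log_sigma).
    """
    d = len(z)
    sigma_sq = math.exp(2.0 * log_sigma)
    sq_norm = sum(v * v for v in z)
    # log p(z) = -d/2 * log(2*pi) - d * log(sigma) - ||z||^2 / (2 * sigma^2)
    return -0.5 * (d * math.log(2.0 * math.pi)
                   + 2.0 * d * log_sigma
                   + sq_norm / sigma_sq)


def sample_prior(dim, log_sigma, rng):
    """Draw one latent vector of length dim from N(0, sigma^2 I)."""
    sigma = math.exp(log_sigma)
    return [rng.gauss(0.0, sigma) for _ in range(dim)]


# Usage: draw a latent vector and score it under the prior.
rng = random.Random(0)
z = sample_prior(4, log_sigma=0.0, rng=rng)
log_p = isotropic_gaussian_log_prob(z, log_sigma=0.0)
```

In a flow-based generator, a latent vector drawn this way would be passed through the inverse flow to produce a molecule, and the log-density term would enter the training likelihood alongside the log-determinant of the flow's Jacobian.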