Molecular Hypergraph Grammar with Its Application to Molecular Optimization
Authors: Hiroshi Kajino
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the effectiveness of MHG in the molecular optimization domain. ... We use the ZINC dataset following the existing work. ... Table 1. Reconstruction rate, predictive performance, and global molecular optimization with the unlimited oracle. |
| Researcher Affiliation | Industry | MIT-IBM Watson AI Lab; IBM Research, Tokyo, Japan. Correspondence to: Hiroshi Kajino <kajino@jp.ibm.com>. |
| Pseudocode | Yes | Algorithm 1 Latent Representation Inference. In: molecules and targets, G_0 = {g_n}_{n=1}^N, Y_0 = {y_n}_{n=1}^N. ... Algorithm 2 Global Molecular Optimization. In: Z_0, Y_0, Dec, #iterations K, #candidates M. |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | We use the ZINC dataset following the existing work. This dataset is extracted from the ZINC database (Irwin et al., 2012) and contains 220,011 molecules for training, 24,445 for validation, and 5,000 for testing. |
| Dataset Splits | Yes | We use the ZINC dataset following the existing work. This dataset is extracted from the ZINC database (Irwin et al., 2012) and contains 220,011 molecules for training, 24,445 for validation, and 5,000 for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. It only vaguely mentions 'our environment' in relation to memory consumption for a baseline. |
| Software Dependencies | No | The paper mentions using GPyOpt (The GPyOpt authors, 2016) but does not provide specific version numbers for the software dependencies needed for replication. |
| Experiment Setup | Yes | For our method, we first obtain latent representations by Algorithm 1. Then, we apply PCA to the latent vectors to obtain 40-dimensional latent representations. Then, we run Algorithm 2 with M = 50, K = 5. ... For our method and JT-VAE, we initialize GP with N = 250 labeled molecules randomly selected from the training set, and run Algorithm 2 with M = 1, K = 250. |
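The experiment-setup quote above outlines the paper's Algorithm 2: starting from initial latent vectors Z0 and targets Y0, each of K iterations proposes M candidate latent vectors from a surrogate model, decodes them into molecules, and scores them with the oracle. The sketch below shows only this loop structure; `decode`, `oracle`, and `propose` are hypothetical stand-ins (the paper fits a Gaussian process via GPyOpt for the acquisition step, which is replaced here by random search for a self-contained toy example).

```python
import numpy as np

def global_molecular_optimization(z0, y0, oracle, decode, propose,
                                  n_iter=5, n_candidates=50):
    """Sketch of the loop structure of the paper's Algorithm 2.

    z0, y0       : initial latent vectors and target values (Z0, Y0)
    oracle       : scores a decoded molecule (the property to maximize)
    decode       : maps a latent vector back to a molecule (Dec)
    propose      : returns n_candidates latent vectors given observations
                   so far (a GP acquisition step in the paper; any
                   surrogate-driven proposer fits this slot)
    n_iter       : K iterations; n_candidates : M candidates per iteration
    """
    Z = [np.asarray(z) for z in z0]
    Y = list(y0)
    for _ in range(n_iter):                                  # K iterations
        for z in propose(np.array(Z), np.array(Y), n_candidates):
            Z.append(np.asarray(z))
            Y.append(oracle(decode(z)))                      # oracle call
    best = int(np.argmax(Y))
    return Z[best], Y[best]

# Toy usage on a 1-D latent space: maximize -(z - 3)^2.
rng = np.random.default_rng(0)
decode = lambda z: z                        # identity "decoder" for the toy
oracle = lambda m: -(float(m[0]) - 3.0) ** 2

def propose(Z, Y, m):
    # Random-search stand-in for the GP + acquisition-function step.
    return rng.uniform(0.0, 6.0, size=(m, 1))

z_best, y_best = global_molecular_optimization(
    [np.zeros(1)], [oracle(np.zeros(1))], oracle, decode, propose,
    n_iter=5, n_candidates=50)
print(z_best, y_best)
```

The paper's unlimited-oracle setting corresponds to M = 50, K = 5, and the limited-oracle setting to M = 1, K = 250 with the GP initialized on N = 250 labeled molecules.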