Molecular Hypergraph Grammar with Its Application to Molecular Optimization

Authors: Hiroshi Kajino

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the effectiveness of MHG in the molecular optimization domain. ... We use the ZINC dataset following the existing work. ... Table 1. Reconstruction rate, predictive performance, and global molecular optimization with the unlimited oracle.
Researcher Affiliation | Industry | MIT-IBM Watson AI Lab; IBM Research, Tokyo, Japan. Correspondence to: Hiroshi Kajino <kajino@jp.ibm.com>.
Pseudocode | Yes | Algorithm 1 Latent Representation Inference. In: molecules and targets, G_0 = {g_n}_{n=1}^N, Y_0 = {y_n}_{n=1}^N. ... Algorithm 2 Global Molecular Optimization. In: Z_0, Y_0, Dec, #iterations K, #candidates M. (Both loops are sketched in Python below the table.)
Open Source Code | No | The paper does not provide concrete access to source code (a specific repository link, an explicit code-release statement, or code in the supplementary materials) for the methodology described in this paper.
Open Datasets | Yes | We use the ZINC dataset following the existing work. This dataset is extracted from the ZINC database (Irwin et al., 2012) and contains 220,011 molecules for training, 24,445 for validation, and 5,000 for testing.
Dataset Splits | Yes | We use the ZINC dataset following the existing work. This dataset is extracted from the ZINC database (Irwin et al., 2012) and contains 220,011 molecules for training, 24,445 for validation, and 5,000 for testing. (A split-size check is sketched below the table.)
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed machine specifications) used for running its experiments. It only vaguely mentions 'our environment' in relation to the memory consumption of a baseline.
Software Dependencies | No | The paper mentions using GPyOpt (The GPyOpt authors, 2016) but does not provide specific version numbers for the software dependencies needed for replication.
Experiment Setup | Yes | For our method, we first obtain latent representations by Algorithm 1. Then, we apply PCA to the latent vectors to obtain 40-dimensional latent representations. Then, we run Algorithm 2 with M = 50, K = 5. ... For our method and JT-VAE, we initialize GP with N = 250 labeled molecules randomly selected from the training set, and run Algorithm 2 with M = 1, K = 250. (Both settings are sketched in Python below the table.)
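
The Pseudocode row quotes Algorithms 1 and 2 only in outline. The following is a minimal Python sketch of the loop structure they describe; `encode`, `decode`, `oracle`, and `propose` are hypothetical stand-ins (not from the paper) for the trained MHG-VAE encoder/decoder, the property evaluator, and a GP-based acquisition routine such as one built on GPyOpt.

```python
import numpy as np

def infer_latents(mols, encode):
    # Algorithm 1: map each molecule g_n to its latent vector z_n using the
    # trained encoder (here a caller-supplied `encode` function).
    return np.stack([encode(g) for g in mols])

def global_molecular_optimization(Z0, Y0, decode, oracle, K, M, propose):
    # Algorithm 2: K rounds of GP-based Bayesian optimization in latent space.
    # `propose(Z, Y, M)` is a hypothetical routine that fits a GP to the
    # labeled pairs and suggests M new latent candidates.
    Z, Y = np.array(Z0), np.array(Y0)
    for _ in range(K):
        candidates = propose(Z, Y, M)                # fit GP, maximize acquisition
        mols = [decode(z) for z in candidates]       # decode latents into molecules
        y_new = np.array([oracle(g) for g in mols])  # query the property oracle
        Z = np.vstack([Z, candidates])               # augment the labeled set
        Y = np.concatenate([Y, y_new])
    return Z, Y
```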
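The reported ZINC split sizes (220,011 / 24,445 / 5,000) can be sanity-checked with a few lines. The file names and one-SMILES-per-line layout below are assumptions for illustration; the paper does not specify how the split files are distributed.

```python
# Hypothetical loader that verifies the split sizes reported in the paper.
# File names and layout are assumptions, not taken from the paper.
def load_zinc_splits(paths=("train.smi", "valid.smi", "test.smi")):
    splits = []
    for path in paths:
        with open(path) as f:
            splits.append([line.strip() for line in f if line.strip()])
    train, valid, test = splits
    assert (len(train), len(valid), len(test)) == (220011, 24445, 5000)
    return train, valid, test
```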
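The Experiment Setup row describes two optimization settings: an unlimited-oracle run (PCA to 40 dimensions, then Algorithm 2 with M = 50, K = 5) and a limited-oracle run (GP initialized with N = 250 randomly selected training molecules, then M = 1, K = 250). The sketch below shows how those hyperparameters plug into the loop above; scikit-learn's PCA is an assumed implementation choice, and seeding the unlimited run with all training latents is likewise an assumption, as the quoted text does not specify the initial labeled set.

```python
import numpy as np
from sklearn.decomposition import PCA

def run_experiment(Z_full, Y_full, decode, oracle, propose, unlimited=True, seed=0):
    # Project the latent vectors to 40 dimensions, as reported in the paper.
    Z40 = PCA(n_components=40).fit_transform(Z_full)
    if unlimited:
        # Unlimited oracle: M = 50 candidates per round, K = 5 rounds.
        # Starting from all labeled latents is an assumption.
        Z0, Y0, M, K = Z40, np.asarray(Y_full), 50, 5
    else:
        # Limited oracle: GP seeded with N = 250 random training molecules,
        # then M = 1 candidate per round for K = 250 rounds.
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(Z40), size=250, replace=False)
        Z0, Y0, M, K = Z40[idx], np.asarray(Y_full)[idx], 1, 250
    # Reuses global_molecular_optimization from the sketch above.
    return global_molecular_optimization(Z0, Y0, decode, oracle, K, M, propose)
```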