Learning to Extend Molecular Scaffolds with Structural Motifs

Authors: Krzysztof Maziarz, Henry Richard Jackson-Flux, Pashmina Cameron, Finton Sirockin, Nadine Schneider, Nikolaus Stiefl, Marwin Segler, Marc Brockschmidt

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that MoLeR performs comparably to state-of-the-art methods on unconstrained molecular optimization tasks, and outperforms them on scaffold-based tasks, while being an order of magnitude faster to train and sample from than existing approaches.
Researcher Affiliation | Industry | Krzysztof Maziarz (Microsoft Research, United Kingdom); Henry Jackson-Flux (Microsoft Research, United Kingdom); Pashmina Cameron (Microsoft Research, United Kingdom); Finton Sirockin (Novartis, Switzerland); Nadine Schneider (Novartis, Switzerland); Nikolaus Stiefl (Novartis, Switzerland); Marwin Segler (Microsoft Research, United Kingdom); Marc Brockschmidt (Microsoft Research, United Kingdom)
Pseudocode | Yes | Algorithm 1 (MoLeR's Generative Procedure) and Algorithm 2 (Determining a generation order)
Open Source Code | Yes | Code is available at https://github.com/microsoft/molecule-generation. (Usage sketch below.)
Open Datasets | Yes | We use training data from GuacaMol (Brown et al., 2019), which released a curated set of 1.5M drug-like molecules, divided into train, validation and test sets.
Dataset Splits | Yes | We use training data from GuacaMol (Brown et al., 2019), which released a curated set of 1.5M drug-like molecules, divided into train, validation and test sets. (Loading sketch below.)
Hardware Specification | Yes | For all measurements in Table 1, we used a machine with a single Tesla K80 GPU.
Software Dependencies | Yes | Our own implementations (MoLeR, CGVAE) are based on TensorFlow 2 (Abadi et al., 2016), while the models of Jin et al. (2018; 2020) (JT-VAE, HierVAE) use PyTorch (Paszke et al., 2019).
Experiment Setup | Yes | We train our model using the Adam optimizer (Kingma & Ba, 2014). We found that adding an initial warm-up phase for the KL loss coefficient λ_prior (i.e. increasing it from 0 to a target value over the course of training) helps to stabilize the model. ... We cap the total number of nodes rather than the total number of molecules, as that is more robust to varying sizes of molecules in the training data. ... For vocabulary sizes up to 32 we used λ_prior = 0.01, and then followed the logarithmic trend described here. (Training-detail sketch below.)
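Usage sketch (Open Source Code row). The released repository (microsoft/molecule-generation) exposes a Python API for sampling from, encoding with, and decoding from a trained MoLeR checkpoint. The snippet below is a minimal sketch based on the repository's documented entry point; `load_model_at_directory`, `sample`, `encode`, `decode`, and the `scaffolds` keyword are taken from the README as I recall it and may differ between package versions, and the model directory path is a placeholder.

```python
# Minimal sketch of using the released package
# (https://github.com/microsoft/molecule-generation).
# Function/method names follow the repository README and may differ by version.
from molecule_generation import load_model_at_directory

MODEL_DIR = "./moler_checkpoint"  # placeholder: directory containing a trained MoLeR model

with load_model_at_directory(MODEL_DIR) as model:
    # Draw new molecules from the prior.
    sampled_smiles = model.sample(10)

    # Round-trip molecules through the latent space.
    latents = model.encode(["c1ccccc1", "CCO"])
    decoded = model.decode(latents)

    # Scaffold-constrained decoding: the i-th scaffold constrains the i-th latent.
    decoded_with_scaffold = model.decode(latents, scaffolds=["c1ccccc1", "CC"])

print(sampled_smiles)
print(decoded)
print(decoded_with_scaffold)
```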
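Loading sketch (Open Datasets / Dataset Splits rows). The GuacaMol data referenced above is distributed as plain SMILES files, one molecule per line, already split into train, validation, and test sets. A minimal loading sketch follows; the file names are assumptions based on the public GuacaMol v1 release and may need adjusting to match a local copy.

```python
from pathlib import Path

# Assumed file names from the public GuacaMol v1 release; adjust if the local
# copy uses different names.
DATA_DIR = Path("./guacamol_data")
SPLITS = {
    "train": "guacamol_v1_train.smiles",
    "valid": "guacamol_v1_valid.smiles",
    "test": "guacamol_v1_test.smiles",
}

def load_split(name: str) -> list[str]:
    """Read one SMILES string per line for the given split."""
    path = DATA_DIR / SPLITS[name]
    return [line.strip() for line in path.open() if line.strip()]

splits = {name: load_split(name) for name in SPLITS}
print({name: len(smiles) for name, smiles in splits.items()})
```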
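Training-detail sketch (Experiment Setup row). Two details quoted above are easy to misread from prose alone: the warm-up that increases the KL coefficient λ_prior from 0 to its target value over training, and batching by a cap on the total number of nodes rather than a fixed number of molecules. The sketch below illustrates both in TensorFlow-2-style Python; the schedule shape (linear), the warm-up length, and the node cap are illustrative assumptions, not the paper's exact settings.

```python
import tensorflow as tf

# --- KL warm-up: anneal lambda_prior from 0 to its target value. ---
# The target (0.01 for vocabulary sizes up to 32, per the paper) is reached
# after `warmup_steps`; a linear schedule and this warm-up length are assumptions.
def kl_coefficient(step, target: float = 0.01, warmup_steps: int = 10_000) -> tf.Tensor:
    frac = tf.minimum(tf.cast(step, tf.float32) / warmup_steps, 1.0)
    return target * frac

# Inside a training step, the annealed coefficient weights the KL term:
#   loss = reconstruction_loss + kl_coefficient(step) * kl_divergence
# and the whole model is optimized with tf.keras.optimizers.Adam.

# --- Node-capped batching: group molecules until a total-node budget is hit. ---
def batch_by_node_count(graphs, max_nodes_per_batch: int = 25_000):
    """Yield batches whose total node count stays under the cap.

    `graphs` is any iterable of objects with a `num_nodes` attribute; the cap
    value here is illustrative, not the paper's setting.
    """
    batch, node_count = [], 0
    for graph in graphs:
        if batch and node_count + graph.num_nodes > max_nodes_per_batch:
            yield batch
            batch, node_count = [], 0
        batch.append(graph)
        node_count += graph.num_nodes
    if batch:
        yield batch
```

Capping nodes per batch rather than molecules per batch keeps memory use roughly constant even when molecule sizes vary widely, which is the robustness property the quoted passage refers to.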