A Model to Search for Synthesizable Molecules
Authors: John Bradshaw, Brooks Paige, Matt J. Kusner, Marwin Segler, José Miguel Hernández-Lobato
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we evaluate MOLECULE CHEF in (1) its ability to generate a diverse set of valid molecules; (2) how useful its learnt latent space is when optimizing product molecules for some property; and (3) whether by training a regressor back from product molecules to the latent space, MOLECULE CHEF can be used as part of a setup to perform retrosynthesis. ... The results are shown in Table 1. |
| Researcher Affiliation | Collaboration | John Bradshaw (University of Cambridge; MPI for Intelligent Systems; jab255@cam.ac.uk); Brooks Paige (University of Cambridge; The Alan Turing Institute; bpaige@turing.ac.uk); Matt J. Kusner (University College London; The Alan Turing Institute; m.kusner@ucl.ac.uk); Marwin H. S. Segler (Benevolent AI; Westfälische Wilhelms-Universität Münster; marwin.segler@benevolent.ai); José Miguel Hernández-Lobato (University of Cambridge; The Alan Turing Institute; Microsoft Research Cambridge; jmh233@cam.ac.uk) |
| Pseudocode | Yes | Algorithm 1: MOLECULE CHEF's Decoder |
| Open Source Code | Yes | Further details can also be found in our appendix and code is available at https://github.com/john-bradshaw/molecule-chef |
| Open Datasets | Yes | In order to train our model we need a dataset of reactant bags. For this we use the USPTO dataset [31], processed and cleaned up by Jin et al. [19]. |
| Dataset Splits | Yes | We filter our training (using Jin et al. [19]'s split) dataset so that each reaction only contains reactants that occur at least 15 times across different reactions in the original larger training USPTO dataset. ... We evaluate on a filtered version of Jin et al. [19]'s test set split of USPTO, where we have filtered out any reactions which have the exact same reactant and product multisets as a reaction present in the set used to train Molecule Chef. In addition, we further split this filtered set into two sets: (i) Reachable Products, which are reactions in the test set that contain as reactants only molecules that are in MOLECULE CHEF's reactant vocabulary, and (ii) Unreachable Products, which have at least one reactant molecule that is not in the vocabulary. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software such as 'RDKit' and 'Molecular Transformer', but it does not specify version numbers for these or any other software dependencies, which would be needed to reproduce the results. |
| Experiment Setup | No | The paper mentions architectural details such as '4 layer Gated Graph Neural Networks (GGNN)' and 'a 2 hidden layer property predictor NN', and a weighting factor 'λ = 10'. However, it does not provide comprehensive experimental setup details like learning rates, batch sizes, specific optimizer settings, or number of epochs, which are crucial for full reproducibility. |
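
The filtering and splitting procedure quoted in the Dataset Splits row can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names (`build_vocabulary`, `filter_training`, `split_test`), the representation of a reaction as a pair of molecule-string lists, and the choice to count a reactant once per reaction are all hypothetical. The toy data uses a threshold of 2 instead of the paper's 15 so that the example stays small.

```python
from collections import Counter


def multiset_key(molecules):
    """Order-independent key for a multiset of molecule strings."""
    return tuple(sorted(molecules))


def build_vocabulary(reactions, min_count=15):
    """Reactants appearing in at least `min_count` different reactions.

    Assumption: a reactant is counted once per reaction, which is one
    reading of "at least 15 times across different reactions".
    """
    counts = Counter()
    for reactants, _products in reactions:
        counts.update(set(reactants))
    return {mol for mol, n in counts.items() if n >= min_count}


def filter_training(reactions, vocab):
    """Keep reactions whose reactants are all in the vocabulary."""
    return [(r, p) for r, p in reactions if all(m in vocab for m in r)]


def split_test(test_reactions, train_reactions, vocab):
    """Drop test reactions seen in training, then split the rest.

    Returns (reachable, unreachable): reachable reactions use only
    vocabulary reactants; unreachable ones have at least one reactant
    outside the vocabulary.
    """
    seen = {(multiset_key(r), multiset_key(p)) for r, p in train_reactions}
    reachable, unreachable = [], []
    for r, p in test_reactions:
        if (multiset_key(r), multiset_key(p)) in seen:
            continue  # same reactant and product multisets as a training reaction
        if all(m in vocab for m in r):
            reachable.append((r, p))
        else:
            unreachable.append((r, p))
    return reachable, unreachable


# Toy data with placeholder molecule names (not real SMILES strings).
raw_train = [
    (["A", "B"], ["AB"]),
    (["A", "C"], ["AC"]),
    (["B", "C"], ["BC"]),
    (["A", "D"], ["AD"]),
]
test = [
    (["B", "A"], ["AB"]),  # duplicates a training reaction
    (["A", "B"], ["X"]),   # all reactants in the vocabulary
    (["A", "D"], ["AD"]),  # "D" is too rare to be in the vocabulary
]

vocab = build_vocabulary(raw_train, min_count=2)
train = filter_training(raw_train, vocab)
reachable, unreachable = split_test(test, train, vocab)
```

Note that duplicates are checked against the filtered training set, following the quoted phrase "the set used to train Molecule Chef". In practice the molecule strings would need to be canonicalized first (for example with RDKit) so that identical molecules compare equal; that step is omitted here.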