RetroBridge: Modeling Retrosynthesis with Markov Bridges
Authors: Ilia Igashov, Arne Schneuing, Marwin Segler, Michael M. Bronstein, Bruno Correia
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For all the experiments we use the USPTO-50k dataset (Schneider et al., 2016) which includes 50k reactions found in the US patent literature. We use standard train/validation/test splits provided by Dai et al. (2019). Somnath et al. (2021) report that the dataset contains a shortcut in that the product atom with atom-mapping 1 is part of the edit in almost 75% of the cases. Even though our model does not depend on the order of graph nodes, we utilize the dataset version with canonical SMILES provided by Somnath et al. (2021). Besides, we randomly permute graph nodes once SMILES are read and converted to graphs. Here we report top-k and round-trip accuracy for RetroBridge and other state-of-the-art methods on the USPTO-50k test set. Table 1 provides exact match accuracy results, and Table 2 reports round-trip accuracy computed using Molecular Transformer (Schwaller et al., 2019). |
| Researcher Affiliation | Collaboration | 1 École Polytechnique Fédérale de Lausanne, 2 Microsoft Research, 3 University of Oxford |
| Pseudocode | Yes | Algorithm 1 (Training of the Markov Bridge Model) and Algorithm 2 (Sampling) |
| Open Source Code | Yes | Our source code is available at https://github.com/igashov/RetroBridge. |
| Open Datasets | Yes | For all the experiments we use the USPTO-50k dataset (Schneider et al., 2016) which includes 50k reactions found in the US patent literature. |
| Dataset Splits | Yes | We use standard train/validation/test splits provided by Dai et al. (2019). |
| Hardware Specification | Yes | We train our models on a single GPU Tesla V100-PCIE-32GB |
| Software Dependencies | No | We train our models on a single GPU Tesla V100-PCIE-32GB using AdamW optimizer (Loshchilov & Hutter, 2017) with learning rate 0.0002 and batch size 64. We trained models for up to 800 epochs (which takes 72 hours) and then selected the best checkpoints based on top-5 accuracy (that was computed on a subset of the USPTO-50k validation set). |
| Experiment Setup | Yes | In all experiments, we use the cosine schedule (Nichol & Dhariwal, 2021) α_t = cos²(0.5π · (t/T + s)/(1 + s)) with s = 0.008 and number of time steps T = 500. We train our models on a single GPU Tesla V100-PCIE-32GB using AdamW optimizer (Loshchilov & Hutter, 2017) with learning rate 0.0002 and batch size 64. |
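
The Experiment Setup row quotes the cosine noise schedule of Nichol & Dhariwal (2021) with s = 0.008 and T = 500. The sketch below shows that schedule as commonly written in the cited reference; the squared cosine and the 1/(1 + s) normalisation are assumptions about the paper's exact notation, and the function name is illustrative.

```python
import numpy as np

def cosine_alpha_bar(T: int = 500, s: float = 0.008) -> np.ndarray:
    """Cumulative schedule values for t = 0..T under the cosine schedule."""
    t = np.arange(T + 1)
    f = np.cos(0.5 * np.pi * (t / T + s) / (1 + s)) ** 2
    return f / f[0]  # normalise so the value at t = 0 equals 1
```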
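The Pseudocode row points to Algorithm 1 (training) and Algorithm 2 (sampling). The sketch below only illustrates the general Markov bridge recipe on categorical node features: corrupt the product toward the known reactants, train a network to predict the endpoint, and at sampling time plug the prediction back into the transition kernel. The convex-mixture kernel, tensor shapes, and the `model(z, t)` signature are simplifying assumptions, not the authors' exact formulation; the linked repository is authoritative.

```python
import torch
import torch.nn.functional as F

def train_step(model, product, reactants, alpha_bar, t_max, optimizer):
    """One training step: corrupt the product node categories toward the
    reactants along the bridge and predict the reactant categories."""
    t = torch.randint(1, t_max + 1, (1,)).item()
    # Bridge marginal: convex mixture of the start (product one-hots)
    # and the known endpoint (reactant one-hots).
    probs = alpha_bar[t] * product + (1 - alpha_bar[t]) * reactants  # (nodes, classes)
    z_t = F.one_hot(torch.multinomial(probs, 1).squeeze(-1), probs.shape[-1]).float()
    logits = model(z_t, t)                         # endpoint prediction
    loss = F.cross_entropy(logits, reactants.argmax(dim=-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def sample(model, product, gamma):
    """Start from the product and iteratively step toward the model's
    predicted endpoint, re-predicting at every step.
    gamma[t] is the per-step mixing coefficient of the bridge kernel."""
    z = product.clone()
    for t in range(1, gamma.shape[0]):
        y_hat = model(z, t).softmax(dim=-1)        # predicted endpoint distribution
        step_probs = gamma[t] * z + (1 - gamma[t]) * y_hat
        z = F.one_hot(torch.multinomial(step_probs, 1).squeeze(-1),
                      step_probs.shape[-1]).float()
    return z
```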
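The Research Type row quotes the preprocessing detail that graph nodes are randomly permuted after SMILES are read, which guards against the atom-mapping shortcut reported by Somnath et al. (2021). A minimal sketch of such a permutation step with RDKit is shown below; the function name and seeding are illustrative, not taken from the paper's code.

```python
import random
from rdkit import Chem

def smiles_to_permuted_mol(smiles: str, seed=None) -> Chem.Mol:
    """Parse a (canonical) SMILES and randomly renumber its atoms."""
    mol = Chem.MolFromSmiles(smiles)
    order = list(range(mol.GetNumAtoms()))
    random.Random(seed).shuffle(order)
    return Chem.RenumberAtoms(mol, order)  # same molecule, permuted node order
```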