RetroBridge: Modeling Retrosynthesis with Markov Bridges

Authors: Ilia Igashov, Arne Schneuing, Marwin Segler, Michael M. Bronstein, Bruno Correia

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For all the experiments we use the USPTO-50k dataset (Schneider et al., 2016), which includes 50k reactions found in the US patent literature. We use standard train/validation/test splits provided by Dai et al. (2019). Somnath et al. (2021) report that the dataset contains a shortcut in that the product atom with atom-mapping 1 is part of the edit in almost 75% of the cases. Even though our model does not depend on the order of graph nodes, we utilize the dataset version with canonical SMILES provided by Somnath et al. (2021). Besides, we randomly permute graph nodes once SMILES are read and converted to graphs. Here we report top-k and round-trip accuracy for RetroBridge and other state-of-the-art methods on the USPTO-50k test set. Table 1 provides exact match accuracy results, and Table 2 reports round-trip accuracy computed using Molecular Transformer (Schwaller et al., 2019). (See the preprocessing sketch below the table.)
Researcher Affiliation | Collaboration | 1 École Polytechnique Fédérale de Lausanne, 2 Microsoft Research, 3 University of Oxford
Pseudocode | Yes | Algorithm 1 (Training of the Markov Bridge Model) and Algorithm 2 (Sampling). (An illustrative sampling sketch is given below the table.)
Open Source Code | Yes | Our source code is available at https://github.com/igashov/RetroBridge.
Open Datasets | Yes | For all the experiments we use the USPTO-50k dataset (Schneider et al., 2016), which includes 50k reactions found in the US patent literature.
Dataset Splits | Yes | We use standard train/validation/test splits provided by Dai et al. (2019).
Hardware Specification | Yes | We train our models on a single GPU Tesla V100-PCIE-32GB.
Software Dependencies | No | We train our models on a single GPU Tesla V100-PCIE-32GB using AdamW optimizer (Loshchilov & Hutter, 2017) with learning rate 0.0002 and batch size 64. We trained models for up to 800 epochs (which takes 72 hours) and then selected the best checkpoints based on top-5 accuracy (that was computed on a subset of the USPTO-50k validation set).
Experiment Setup | Yes | In all experiments, we use the cosine schedule (Nichol & Dhariwal, 2021) α_t = cos²(0.5π (t/T + s) / (1 + s)) with s = 0.008 and number of time steps T = 500. We train our models on a single GPU Tesla V100-PCIE-32GB using AdamW optimizer (Loshchilov & Hutter, 2017) with learning rate 0.0002 and batch size 64. (See the schedule and optimizer sketch below the table.)
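
To make the preprocessing quoted in the Research Type row concrete, here is a minimal sketch (not the authors' implementation) of reading a canonical SMILES string with RDKit, building a simple atom/adjacency graph, and randomly permuting the node order. The function name and the choice of node/edge features are assumptions made for illustration only.

```python
# Hypothetical sketch: canonical SMILES -> simple graph with randomly permuted nodes.
import numpy as np
from rdkit import Chem

def smiles_to_permuted_graph(smiles: str, rng: np.random.Generator):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles}")

    # Node features: atom symbols; edge structure: bond adjacency matrix.
    atoms = [atom.GetSymbol() for atom in mol.GetAtoms()]
    adj = Chem.GetAdjacencyMatrix(mol)  # (N, N) 0/1 numpy array

    # Randomly permute the node order, applied consistently to nodes and adjacency,
    # so downstream models cannot exploit atom-ordering shortcuts.
    perm = rng.permutation(len(atoms))
    atoms = [atoms[i] for i in perm]
    adj = adj[perm][:, perm]
    return atoms, adj

rng = np.random.default_rng(seed=0)
atoms, adj = smiles_to_permuted_graph("CC(=O)Oc1ccccc1C(=O)O", rng)  # aspirin as a test case
```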
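Algorithm 2 (Sampling) is not reproduced in this report. The following is only an illustrative sketch of a generic categorical Markov-bridge sampling loop in the spirit of the paper: start from the product graph, repeatedly predict the bridge endpoint (reactants) and sample the next state from a transition that mixes the current state toward that prediction. The transition kernel, the network interface `predict_endpoint`, and the mixing weight `beta` are assumptions and may differ from the paper's exact formulation.

```python
# Illustrative categorical Markov-bridge sampling loop (assumed form, not the paper's exact kernel).
import numpy as np

def sample_markov_bridge(x_onehot, predict_endpoint, beta, T, rng):
    """x_onehot: (N, K) one-hot product-node categories (the bridge start).
    predict_endpoint(z, t) -> (N, K) predicted endpoint (reactant) probabilities.
    beta(t) -> scalar in [0, 1], per-step mixing weight toward the predicted endpoint."""
    z = x_onehot.astype(float).copy()
    for t in range(T):
        y_hat = predict_endpoint(z, t)                      # network's guess of the endpoint
        probs = (1.0 - beta(t)) * z + beta(t) * y_hat       # move current state toward the guess
        probs = probs / probs.sum(axis=-1, keepdims=True)   # renormalize per node
        # Sample the next categorical state for every node independently.
        z_next = np.zeros_like(z)
        for i, p in enumerate(probs):
            k = rng.choice(len(p), p=p)
            z_next[i, k] = 1.0
        z = z_next
    return z  # sampled reactant-node categories at the end of the bridge
```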
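The Experiment Setup row can be mirrored in code as follows. This is a sketch assuming the standard Nichol & Dhariwal (2021) cosine schedule in cumulative form and a placeholder network; it is not the repository's training script.

```python
# Sketch of the reported noise schedule and optimizer settings.
import math
import torch

T, s = 500, 0.008  # number of time steps and schedule offset quoted above

def alpha_bar(t: int) -> float:
    # Cumulative cosine schedule: alpha_bar_t = cos^2(0.5 * pi * (t/T + s) / (1 + s))
    return math.cos(0.5 * math.pi * (t / T + s) / (1 + s)) ** 2

# Optimizer settings reported in the paper: AdamW, learning rate 0.0002, batch size 64.
model = torch.nn.Linear(16, 16)  # placeholder for the actual graph transformer network
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
```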