Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer
Authors: Yue Wan, Chang-Yu Hsieh, Ben Liao, Shengyu Zhang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our model can improve over the vanilla Transformer by 12.5% and 14.4% top-10 accuracy in the reaction class known and unknown settings, respectively. It reaches the new state-of-the-art accuracy for template-free methods and is competitive against both template-based and semi-template-based methods. It also enjoys better molecule and reaction validity compared to strong baseline models. (A top-10 accuracy sketch follows the table.) |
| Researcher Affiliation | Industry | Tencent Quantum Laboratory, Shenzhen, China. Correspondence to: Chang-Yu Hsieh <kimhsieh@tencent.com>, Shengyu Zhang <shengyzhang@tencent.com>. |
| Pseudocode | Yes | Algorithm 1 SMILES Graph Construction; Algorithm 2 SMILES Token Alignment Computation; Algorithm 3 Reaction Center Subgraph Search (a SMILES-to-graph sketch follows the table) |
| Open Source Code | Yes | Our code is available at https://github.com/yuewan2/Retroformer. |
| Open Datasets | Yes | We use the conventional retrosynthesis benchmark dataset USPTO-50K (Schneider et al., 2016) to evaluate our method. |
| Dataset Splits | Yes | We use the conventional retrosynthesis benchmark dataset USPTO-50K (Schneider et al., 2016) to evaluate our method. It contains 50016 atom-mapped reactions that are grouped into 10 reaction classes. We use the same data split as (Coley et al., 2017). |
| Hardware Specification | Yes | Retroformer-base is trained on 1 NVIDIA Tesla V100 GPU for 24 hours. |
| Software Dependencies | No | The paper mentions the Adam optimizer, states that the model is built on top of the vanilla Transformer, and trains a vanilla retrosynthesis Transformer from scratch with OpenNMT (Klein et al., 2017), but it does not specify software dependencies such as Python, PyTorch/TensorFlow, or other library versions for its own Retroformer implementation. |
| Experiment Setup | Yes | The model is trained using the Adam optimizer (Kingma & Ba, 2017) with a fixed learning rate of 1e-4 and a dropout rate of 0.3. The embedding dimension is set to 256, and the total number of heads is set to 8. We split the heads by half for global and local heads. (A configuration sketch follows the table.) |
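
The reported 12.5% and 14.4% gains are measured in top-10 accuracy. In retrosynthesis benchmarks this is commonly computed as an exact match between canonicalized predicted and ground-truth reactant SMILES within the top-k beam candidates. The sketch below only illustrates that convention; the function names and data layout are assumptions, not the authors' evaluation script.

```python
# Hedged sketch of top-k exact-match accuracy over canonical SMILES.
# Details such as multi-reactant ordering may differ from the paper's script.
from rdkit import Chem

def canonicalize(smiles):
    """Return the canonical SMILES string, or None if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    return Chem.MolToSmiles(mol) if mol is not None else None

def top_k_accuracy(predictions, targets, k=10):
    """predictions: one list of beam candidates per reaction; targets: ground-truth reactant SMILES."""
    hits = 0
    for candidates, target in zip(predictions, targets):
        gold = canonicalize(target)
        found = {canonicalize(c) for c in candidates[:k]}
        if gold is not None and gold in found:
            hits += 1
    return hits / len(targets)
```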
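The paper's Algorithm 1 constructs a graph from a SMILES string. Its pseudocode is not reproduced here; the following is a minimal RDKit-based sketch of the general idea, using atomic numbers as node features and bond types as edge labels. The feature set and the `smiles_to_graph` name are illustrative assumptions rather than Retroformer's actual implementation.

```python
# Minimal sketch of SMILES-to-graph construction (Algorithm 1 analogue).
# Assumes RDKit; the chosen features are illustrative, not the paper's exact set.
from rdkit import Chem

def smiles_to_graph(smiles):
    """Return (atom_features, edges) for a SMILES string, or None if parsing fails."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # One node per atom; here we keep only the atomic number as a feature.
    atom_features = [atom.GetAtomicNum() for atom in mol.GetAtoms()]
    # One edge per bond, stored in both directions with its bond type.
    edges = []
    for bond in mol.GetBonds():
        i, j = bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()
        edges.append((i, j, str(bond.GetBondType())))
        edges.append((j, i, str(bond.GetBondType())))
    return atom_features, edges

print(smiles_to_graph("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin
```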
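The Experiment Setup row fixes the main hyperparameters. Below is a minimal PyTorch sketch that wires the reported values together (Adam at a fixed learning rate of 1e-4, dropout 0.3, embedding dimension 256, 8 attention heads). The plain `nn.Transformer` is a stand-in assumption: it does not implement Retroformer's local-attention branch, which takes half of the 8 heads.

```python
# Hyperparameter sketch matching the reported setup; the model below is a
# plain nn.Transformer stand-in, NOT Retroformer's global/local attention.
import torch
import torch.nn as nn

EMBED_DIM = 256       # embedding dimension reported in the paper
NUM_HEADS = 8         # total heads; Retroformer splits them 4 global / 4 local
DROPOUT = 0.3         # dropout rate reported in the paper
LEARNING_RATE = 1e-4  # fixed Adam learning rate reported in the paper

model = nn.Transformer(
    d_model=EMBED_DIM,
    nhead=NUM_HEADS,
    dropout=DROPOUT,
    batch_first=True,
)
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
```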