Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Retroformer: Pushing the Limits of End-to-end Retrosynthesis Transformer

Authors: Yue Wan, Chang-Yu Hsieh, Ben Liao, Shengyu Zhang

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments show that our model can improve over the vanilla Transformer by 12.5% and 14.4% top-10 accuracy in the reaction class known and unknown settings, respectively. It reaches the new state-of-the-art accuracy for template-free methods and is competitive against both template-based and semi-template-based methods. It also enjoys better molecule and reaction validity compared to strong baseline models."
Researcher Affiliation | Industry | "Tencent Quantum Laboratory, Shenzhen, China. Correspondence to: Chang-Yu Hsieh <EMAIL>, Shengyu Zhang <EMAIL>."
Pseudocode | Yes | "Algorithm 1 SMILES Graph Construction; Algorithm 2 SMILES Token Alignment Computation; Algorithm 3 Reaction Center Subgraph Search"
Open Source Code | Yes | "Our code is available at https://github.com/yuewan2/Retroformer."
Open Datasets | Yes | "We use the conventional retrosynthesis benchmark dataset USPTO-50K (Schneider et al., 2016) to evaluate our method."
Dataset Splits | Yes | "We use the conventional retrosynthesis benchmark dataset USPTO-50K (Schneider et al., 2016) to evaluate our method. It contains 50016 atom-mapped reactions that are grouped into 10 reaction classes. We use the same data split as (Coley et al., 2017)."
Hardware Specification | Yes | "Retroformer-base is trained on 1 NVIDIA Tesla V100 GPU for 24 hours."
Software Dependencies | No | The paper mentions using the Adam optimizer, notes that the model is built on top of the vanilla Transformer, and trains a vanilla retrosynthesis Transformer from scratch using OpenNMT (Klein et al., 2017), but it does not specify software dependencies (e.g., Python, PyTorch/TensorFlow, or other library versions) for its own Retroformer implementation.
Experiment Setup | Yes | "The model is trained using the Adam optimizer (Kingma & Ba, 2017) with a fixed learning rate of 1e-4, and a dropout rate of 0.3. The embedding dimension is set to 256, and the total amount of heads is set to 8. We split the heads by half for global and local heads."