RetroXpert: Decompose Retrosynthesis Prediction Like A Chemist

Authors: Chaochao Yan, Qianggang Ding, Peilin Zhao, Shuangjia Zheng, Jinyu Yang, Yang Yu, Junzhou Huang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on USPTO-50K [19] and USPTO-full [25] to verify its effectiveness and scalability. ... Our method RetroXpert achieves impressive performance on the test data. ... Experimental results are reported at the bottom of Table 3.
Researcher Affiliation | Collaboration | Chaochao Yan (University of Texas at Arlington, chaochao.yan@mavs.uta.edu); Qianggang Ding (Tsinghua University, dqg18@mails.tsinghua.edu.cn); Peilin Zhao (Tencent AI Lab, masonzhao@tencent.com); Shuangjia Zheng (Sun Yat-sen University, zhengshj9@mail2.sysu.edu.cn); Jinyu Yang (University of Texas at Arlington, jinyu.yang@mavs.uta.edu); Yang Yu (Tencent AI Lab, kevinyyu@tencent.com); Junzhou Huang (University of Texas at Arlington, jzhuang@uta.edu)
Pseudocode | No | The paper describes the architecture and functionality of its models (EGAT, RGN) using mathematical equations and textual explanations. However, it does not include any explicitly labeled pseudocode blocks or algorithms in a structured format.
Open Source Code | Yes | Code and processed USPTO-full data are available at https://github.com/uta-smile/RetroXpert
Open Datasets | Yes | We evaluate our method on USPTO-50K [19] and USPTO-full [25] to verify its effectiveness and scalability.
Dataset Splits | Yes | We adopt the same training/validation/test splits in 8:1:1 as [12, 5]. (See the split sketch after the table.)
Hardware Specification | Yes | We train the RGN for 300,000 time steps, and it takes about 30 hours on two GTX 1080 Ti GPUs.
Software Dependencies | No | The paper mentions software such as DGL [30], OpenNMT [33], and RDKit, but it does not specify version numbers for these components, which a reproducible description of ancillary software requires.
Experiment Setup | Yes | As for the EGAT, we stack three identical four-head attentive layers of which the hidden dimension is 128. All embedding sizes in EGAT are set to 128, such as F, F′, and D. The N_max is set to be two to cover 99.97% of training samples. We train the EGAT on USPTO-50K for 80 epochs. EGAT parameters are optimized with Adam [34] with default settings, and the initial learning rate is 0.0005; it is scheduled to multiply by 0.2 every 20 epochs. (A training-configuration sketch follows the table.)
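
The Dataset Splits row quotes an 8:1:1 training/validation/test partition. RetroXpert reuses the published USPTO-50K splits from prior work [12, 5] rather than re-splitting, so the helper below is only a hypothetical sketch for readers who need to build a comparable partition from scratch; the function name, seed, and shuffling strategy are assumptions, not the authors' code.

```python
# Hypothetical helper (not the authors' code): build an 8:1:1
# train/validation/test partition with reproducible shuffling.
import random

def split_8_1_1(records, seed=0):
    """Return (train, valid, test) lists in 8:1:1 proportions."""
    records = list(records)
    random.Random(seed).shuffle(records)
    n_train = int(0.8 * len(records))
    n_valid = int(0.1 * len(records))
    return (records[:n_train],
            records[n_train:n_train + n_valid],
            records[n_train + n_valid:])
```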
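
The Experiment Setup row amounts to a small training configuration. Below is a minimal sketch, assuming PyTorch and DGL, that mirrors the reported hyperparameters: three four-head attention layers, hidden size 128, Adam at learning rate 0.0005 decayed by a factor of 0.2 every 20 epochs, trained for 80 epochs. The standard GATConv layer is only a stand-in for the paper's edge-enhanced EGAT layer, and the per-head output size of 32 (so the concatenated width stays at 128) is an assumption.

```python
# Minimal sketch (assumptions noted above); GATConv stands in for the paper's EGAT layer.
import torch
import torch.nn as nn
from dgl.nn.pytorch import GATConv

HIDDEN = 128      # hidden / embedding size reported in the paper
NUM_HEADS = 4     # four-head attentive layers
NUM_LAYERS = 3    # three identical layers

class AttentiveStack(nn.Module):
    def __init__(self, in_feats):
        super().__init__()
        self.layers = nn.ModuleList()
        feats = in_feats
        for _ in range(NUM_LAYERS):
            # each head emits HIDDEN // NUM_HEADS features so the
            # concatenation of 4 heads keeps the hidden size at 128 (assumption)
            self.layers.append(GATConv(feats, HIDDEN // NUM_HEADS, num_heads=NUM_HEADS))
            feats = HIDDEN

    def forward(self, g, h):
        for layer in self.layers:
            h = layer(g, h).flatten(1)   # concatenate the attention heads
            h = torch.relu(h)
        return h

model = AttentiveStack(in_feats=HIDDEN)

# Adam with the reported initial learning rate, multiplied by 0.2 every 20 epochs
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.2)

for epoch in range(80):   # the paper trains EGAT on USPTO-50K for 80 epochs
    # ... per-epoch training loop over reaction graphs would go here ...
    scheduler.step()
```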