A Graph to Graphs Framework for Retrosynthesis Prediction

Authors: Chence Shi, Minkai Xu, Hongyu Guo, Ming Zhang, Jian Tang

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results show that G2Gs significantly outperforms existing template-free approaches by up to 63% in terms of the top-1 accuracy and achieves a performance close to that of state-of-the-art template-based approaches, but does not require domain knowledge and is much more scalable."
Researcher Affiliation | Academia | (1) Department of Computer Science, School of EECS, Peking University; (2) Shanghai Jiao Tong University; (3) National Research Council Canada; (4) Montréal Institute for Learning Algorithms (MILA); (5) Canadian Institute for Advanced Research (CIFAR); (6) HEC Montréal. Correspondence to: Chence Shi <chenceshi@pku.edu.cn>, Jian Tang <jian.tang@hec.ca>.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include an unambiguous statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "We evaluate our approach on the widely used benchmark data set USPTO-50k, which contains 50k atom-mapped reactions with 10 reaction types."
Dataset Splits | Yes | "Following (Liu et al., 2017), we randomly select 80% of the reactions as training set and divide the rest into validation and test sets with equal size."
Hardware Specification | Yes | "We train our G2Gs for 100 epochs with a batch size of 128 and a learning rate of 0.0001 with Adam (Kingma & Ba, 2014) optimizer on a single GTX 1080Ti GPU card."
Software Dependencies | No | "G2Gs is implemented in PyTorch (Paszke et al., 2017). We use the open-source chemical software RDKit (Landrum, 2016) to preprocess molecules for the training and generate canonical SMILES strings for the evaluation." Specific version numbers for PyTorch and RDKit are not provided.
Experiment Setup | Yes | "The R-GCN in G2Gs is implemented with 4 layers and the embedding size is set as 512 for both modules. We use latent codes of dimension |z| = 10. We train our G2Gs for 100 epochs with a batch size of 128 and a learning rate of 0.0001 with Adam (Kingma & Ba, 2014) optimizer on a single GTX 1080Ti GPU card. The λ is set as 20 for reaction center identification module, and the beam size is 10 during inference. The maximal number of transformation steps is set as 20."
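
The quoted protocol in the Dataset Splits row is concrete enough to sketch in code. Below is a minimal Python sketch of the 80%/10%/10% random split; the function name and the fixed seed are our assumptions (the excerpt does not mention a seed).

```python
import random

def split_uspto50k(reactions, seed=0):
    """Randomly split reactions 80/10/10 into train/valid/test,
    following the protocol of Liu et al. (2017) quoted above.
    The seed is an assumption; the excerpt does not report one."""
    reactions = list(reactions)
    random.Random(seed).shuffle(reactions)
    n_train = int(0.8 * len(reactions))
    n_valid = (len(reactions) - n_train) // 2
    train = reactions[:n_train]
    valid = reactions[n_train:n_train + n_valid]
    test = reactions[n_train + n_valid:]
    return train, valid, test
```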
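The Software Dependencies row notes that RDKit generates canonical SMILES for the evaluation. A plausible matching check is sketched below; the helper names are ours, and both the exact-string-match criterion and the clearing of atom-map numbers (the data set is atom-mapped) are assumptions rather than details stated in the excerpts.

```python
from rdkit import Chem

def canonicalize(smiles):
    """Return the RDKit canonical SMILES, or None if parsing fails.
    Atom-map numbers are cleared first (an assumption on our part),
    since USPTO-50k reactions are atom-mapped."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    for atom in mol.GetAtoms():
        atom.SetAtomMapNum(0)
    return Chem.MolToSmiles(mol, canonical=True)

def smiles_match(predicted, ground_truth):
    """Assumed evaluation criterion: exact match of canonical forms."""
    cand, ref = canonicalize(predicted), canonicalize(ground_truth)
    return cand is not None and cand == ref
```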
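Finally, the hyperparameters quoted under Hardware Specification and Experiment Setup can be collected in one place. The dataclass below is a sketch that only records values stated in the paper; the class and field names are ours, since no official code is available.

```python
from dataclasses import dataclass

@dataclass
class G2GsConfig:
    """Hyperparameters as reported in the paper; names are ours."""
    rgcn_layers: int = 4              # R-GCN depth
    embedding_size: int = 512         # shared by both modules
    latent_dim: int = 10              # |z|
    center_loss_lambda: float = 20.0  # weight for reaction center identification
    epochs: int = 100
    batch_size: int = 128
    learning_rate: float = 1e-4       # Adam (Kingma & Ba, 2014)
    beam_size: int = 10               # inference
    max_transform_steps: int = 20
```

Given a model, the reported optimizer would then be torch.optim.Adam(model.parameters(), lr=G2GsConfig().learning_rate), trained for 100 epochs at batch size 128 on a single GTX 1080Ti.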