Doubly Stochastic Graph-based Non-autoregressive Reaction Prediction

Authors: Ziqiao Meng, Peilin Zhao, Yang Yu, Irwin King

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive empirical results on the open benchmark dataset USPTO-MIT demonstrate that our approach consistently outperforms baseline non-autoregressive reaction prediction models. We conduct experiments on three different splits of reaction prediction: the random split, the tanimoto-0.4 split, and the tanimoto-0.6 split (an illustrative similarity-threshold sketch appears after this table). For the original random split, the training, validation, and testing sets follow a split ratio of 409K:30K:40K.
Researcher Affiliation | Collaboration | Ziqiao Meng (1), Peilin Zhao (2), Yang Yu (2) and Irwin King (1); (1) The Chinese University of Hong Kong, (2) Tencent AI Lab; {zqmeng, king}@cse.cuhk.edu.hk, mazonzhao@tencent.com, kevinyyu@tencent.com
Pseudocode | No | The paper describes algorithms but does not include a figure, block, or section explicitly labeled "Pseudocode" or "Algorithm".
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the source code for the work described in this paper, nor does it provide a direct link to a source-code repository for their method.
Open Datasets | Yes | Following previous work, we evaluate our approach on the open public benchmark dataset USPTO-MIT [Jin et al., 2017], which contains 479K reactions filtered by removing duplicates and erroneous reactions from Lowe's original data [Lowe, 2012].
Dataset Splits | Yes | For the original random split, the training, validation, and testing sets follow a split ratio of 409K:30K:40K. For the scaffold split, the split ratio is 392K:30K:50K.
Hardware Specification | Yes | Finally, we train our Reaction Sink for 100 epochs with a batch size of 128 using 8 Nvidia V100 GPUs in this work.
Software Dependencies | No | The model is optimized using the Adam optimizer [Kingma and Ba, 2015] at a learning rate of 10^-4 with linear warm-up and linear learning rate decay. However, no specific version numbers for software libraries or frameworks (e.g., PyTorch, TensorFlow, Python) are provided.
Experiment Setup | Yes | To ensure a fair comparison with the major baseline models Molecular Transformer and NERF, we set the number of transformer encoder layers and transformer decoder layers (cross-attention layers) to 4, the same as previous work, and we set the dimension of the latent embedding to 256. For the multi-head attention decoder, Bond Formation and Bond Breaking both have 4 attention heads. The model is optimized using the Adam optimizer [Kingma and Ba, 2015] at a learning rate of 10^-4 with linear warm-up and linear learning rate decay (a schedule sketch follows the table). The number of iterations l of Sinkhorn normalization is also a hyperparameter to fine-tune (see the Sinkhorn sketch below). Finally, we train our Reaction Sink for 100 epochs with a batch size of 128 using 8 Nvidia V100 GPUs.
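
The tanimoto-0.4 and tanimoto-0.6 splits quoted in the Research Type row separate reactions by molecular similarity rather than at random. The snippet below is only a minimal sketch of how a Tanimoto similarity threshold could be checked with RDKit Morgan fingerprints; the actual split construction follows prior work and is not reproduced here, and the helper name, fingerprint settings, and example molecules are illustrative assumptions.

```python
# Sketch: checking a Tanimoto similarity threshold with RDKit (illustrative only).
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem


def max_similarity_to_train(test_smiles, train_smiles):
    """Maximum Tanimoto similarity of one test molecule to a set of training molecules."""
    fp = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(test_smiles), 2, nBits=2048)
    train_fps = [
        AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=2048)
        for s in train_smiles
    ]
    return max(DataStructs.TanimotoSimilarity(fp, t) for t in train_fps)


# Illustrative rule: a molecule qualifies for a "tanimoto-0.4" test set only if it is
# at most 0.4 similar to every training molecule (example molecules are placeholders).
train = ["CCO", "c1ccccc1", "CC(=O)O"]
print(max_similarity_to_train("CCN", train) <= 0.4)
```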
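
The Sinkhorn normalization referenced in the Experiment Setup row alternates row and column normalization so that a positive score matrix approaches a doubly stochastic matrix (all rows and columns sum to 1), with the number of iterations l as a tunable hyperparameter. Below is a minimal PyTorch sketch, assuming positivity is obtained by element-wise exponentiation; the function name, default iteration count, and tensor shapes are illustrative and not taken from the paper.

```python
# Sketch: Sinkhorn normalization toward a doubly stochastic matrix (illustrative only).
import torch


def sinkhorn_normalize(scores: torch.Tensor, num_iters: int = 10, eps: float = 1e-8) -> torch.Tensor:
    """Alternately normalize rows and columns of exp(scores) for num_iters iterations."""
    mat = torch.exp(scores)  # ensure strictly positive entries
    for _ in range(num_iters):
        mat = mat / (mat.sum(dim=-1, keepdim=True) + eps)  # row normalization
        mat = mat / (mat.sum(dim=-2, keepdim=True) + eps)  # column normalization
    return mat


scores = torch.randn(5, 5)
ds = sinkhorn_normalize(scores, num_iters=20)
print(ds.sum(dim=0), ds.sum(dim=1))  # both close to vectors of ones
```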
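
The paper states that the model is optimized with Adam at a learning rate of 10^-4 using linear warm-up and linear learning rate decay. A minimal sketch of such a schedule with PyTorch's LambdaLR follows; the warm-up length, total step count, and stand-in model are assumed values, not figures reported in the paper.

```python
# Sketch: Adam with linear warm-up and linear decay via LambdaLR (assumed step counts).
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(256, 256)          # stand-in for the actual reaction model
optimizer = Adam(model.parameters(), lr=1e-4)

warmup_steps, total_steps = 1000, 100_000  # assumed values, not from the paper


def lr_lambda(step: int) -> float:
    if step < warmup_steps:
        return step / max(1, warmup_steps)                                   # linear warm-up
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))     # linear decay


scheduler = LambdaLR(optimizer, lr_lambda)
```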