Doubly Stochastic Graph-based Non-autoregressive Reaction Prediction
Authors: Ziqiao Meng, Peilin Zhao, Yang Yu, Irwin King
IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive empirical results on the open benchmark dataset USPTO-MIT demonstrate that our approach consistently outperforms baseline non-autoregressive reaction prediction models. We conduct experiments on three different splits of reaction prediction, namely random split, Tanimoto-0.4 split, and Tanimoto-0.6 split. For the original random split, the training, validation, and testing sets follow a split ratio of 409K:30K:40K. |
| Researcher Affiliation | Collaboration | Ziqiao Meng¹, Peilin Zhao², Yang Yu², and Irwin King¹ (¹The Chinese University of Hong Kong; ²Tencent AI Lab). {zqmeng, king}@cse.cuhk.edu.hk, mazonzhao@tencent.com, kevinyyu@tencent.com |
| Pseudocode | No | The paper describes algorithms but does not include a figure, block, or section explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the source code for the work described in this paper, nor does it provide a direct link to a source-code repository for their method. |
| Open Datasets | Yes | Following previous work, we evaluate our approach on the open public benchmark dataset USPTO-MIT [Jin et al., 2017], which contains 479K reactions filtered by removing duplicates and erroneous reactions from Lowe’s original data [Lowe, 2012]. |
| Dataset Splits | Yes | For the original random split, the training, validation, and testing sets follow a split ratio of 409K:30K:40K. For the scaffold split, the split ratio is 392K:30K:50K. |
| Hardware Specification | Yes | Finally, we train our Reaction Sink for 100 epochs with a batch size of 128 using 8 Nvidia V100 GPUs in this work. |
| Software Dependencies | No | The model is optimized using the Adam optimizer [Kingma and Ba, 2015] at learning rate 10⁻⁴ with linear warm-up and linear learning rate decay. However, no specific version numbers for software libraries or frameworks (e.g., PyTorch, TensorFlow, Python) are provided. |
| Experiment Setup | Yes | To ensure a fair comparison with the major baseline models Molecular Transformer and NERF, we set the number of transformer encoder layers and transformer decoder layers (cross-attention layers) to 4, the same as previous work, and we set the dimension of the latent embedding to 256. For the multi-head attention decoder, Bond Formation and Bond Breaking both have 4 attention heads. The model is optimized using the Adam optimizer [Kingma and Ba, 2015] at learning rate 10⁻⁴ with linear warm-up and linear learning rate decay. The number of iterations l of Sinkhorn normalization is also a hyperparameter to tune. Finally, we train our Reaction Sink for 100 epochs with a batch size of 128 using 8 Nvidia V100 GPUs. (Hedged sketches of the split, the training schedule, and the Sinkhorn step follow this table.) |
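
The dataset-split rows above quote a 409K:30K:40K random split of the roughly 479K USPTO-MIT reactions. A minimal sketch of reproducing such a split is shown below, assuming the reactions are already loaded into a Python list; the `random_split` helper, the counts, and the fixed seed are illustrative of the quoted ratio, not the authors' exact split files.

```python
# Sketch of a 409K:30K:40K random split of ~479K reactions, per the quote above.
# `reactions`, the seed, and the helper name are illustrative assumptions.
import random

def random_split(reactions, n_train=409_000, n_valid=30_000, seed=0):
    rng = random.Random(seed)
    indices = list(range(len(reactions)))
    rng.shuffle(indices)
    train = [reactions[i] for i in indices[:n_train]]
    valid = [reactions[i] for i in indices[n_train:n_train + n_valid]]
    test  = [reactions[i] for i in indices[n_train + n_valid:]]  # remaining ~40K
    return train, valid, test
```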
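
The experiment setup quotes an Adam optimizer at learning rate 10⁻⁴ with linear warm-up and linear learning-rate decay, trained for 100 epochs at batch size 128. A minimal sketch of such a schedule is given below, assuming PyTorch; the warm-up step count and the placeholder model are assumptions, since the paper does not report them.

```python
# Sketch of the quoted schedule: Adam at 1e-4 with linear warm-up then linear decay.
# warmup_steps and the stand-in model are illustrative assumptions.
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(256, 256)        # stand-in for the Reaction Sink model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

steps_per_epoch = 409_000 // 128         # ~409K training reactions, batch size 128
total_steps = 100 * steps_per_epoch      # 100 epochs
warmup_steps = 4_000                     # assumed; not stated in the paper

def lr_lambda(step: int) -> float:
    """Linear warm-up to the base LR, then linear decay towards zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

scheduler = LambdaLR(optimizer, lr_lambda)

# Inside the training loop, after loss.backward():
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```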
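
The setup also names the number of iterations l of Sinkhorn normalization as a hyperparameter, i.e., the operation that pushes the predicted matrix toward being doubly stochastic. Below is a minimal sketch of plain Sinkhorn normalization (alternating row and column normalization); the `sinkhorn` helper and the tensor shapes are illustrative, not the paper's exact formulation.

```python
# Sketch of Sinkhorn normalization with l iterations: alternating row and column
# normalization of a positive matrix converges toward a doubly stochastic matrix.
import torch

def sinkhorn(logits: torch.Tensor, n_iters: int = 10, eps: float = 1e-8) -> torch.Tensor:
    """Return an (approximately) doubly stochastic matrix from square `logits`."""
    mat = torch.exp(logits)                                   # strictly positive entries
    for _ in range(n_iters):
        mat = mat / (mat.sum(dim=-1, keepdim=True) + eps)     # row normalization
        mat = mat / (mat.sum(dim=-2, keepdim=True) + eps)     # column normalization
    return mat

# Usage: rows and columns both sum to ~1 after enough iterations.
scores = torch.randn(16, 16)
ds = sinkhorn(scores, n_iters=20)
print(ds.sum(dim=-1))   # ≈ 1 per row
print(ds.sum(dim=-2))   # ≈ 1 per column
```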