RetroOOD: Understanding Out-of-Distribution Generalization in Retrosynthesis Prediction
Authors: Yemin Yu, Luotian Yuan, Ying Wei, Hanyu Gao, Fei Wu, Zhihua Wang, Xinhai Ye
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To this end, we first formally sort out two types of distribution shifts in retrosynthesis prediction and construct two groups of benchmark datasets. Next, through comprehensive experiments, we systematically compare state-of-the-art retrosynthesis prediction models on the two groups of benchmarks, revealing the limitations of previous in-distribution evaluation and re-examining the advantages of each model. (An illustrative distribution-shift split sketch appears below the table.) |
| Researcher Affiliation | Academia | Yemin Yu [1,5]*, Luotian Yuan [2]*, Ying Wei [3], Hanyu Gao [4], Fei Wu [2,5], Zhihua Wang [5], Xinhai Ye [5]. Affiliations: [1] City University of Hong Kong; [2] Zhejiang University; [3] Nanyang Technological University; [4] Hong Kong University of Science and Technology; [5] Shanghai Institute for Advanced Study of Zhejiang University |
| Pseudocode | Yes | The complete algorithm is listed as Alg. 1 in the Appendix. |
| Open Source Code | No | The paper does not provide an explicit statement or a link to the open-source code for the methodology it describes. |
| Open Datasets | Yes | on the benchmark USPTO50K dataset (Schneider, Stiefl, and Landrum 2016) |
| Dataset Splits | No | The paper mentions a "train-test data split" and reports N_train/N_test sample sizes, but it does not state the split proportions, confirm whether a validation set exists, or give the concrete percentages or sample counts needed to reproduce the data partitioning. (A sketch of the conventional USPTO-50K split appears below the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running its experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., Python, PyTorch, specific libraries) that would be needed to replicate the experiments. |
| Experiment Setup | No | The paper states that "All baseline models are re-trained on each of the four OOD datasets separately for evaluation" and that "The complete architecture details of the EBM and the ablation study on the different settings of n are elaborated in the Appendix," indicating that specific hyperparameter values and detailed training configurations are not given in the main text. |
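
For context on the Open Datasets and Dataset Splits rows above, here is a minimal sketch of loading USPTO-50K reaction SMILES and producing a random 80/10/10 split. This is not a reproduction of the paper's setup: the file name and column name are assumptions about one common CSV distribution of the dataset, and the 80/10/10 proportions are only the conventional choice for USPTO-50K, since (per the table) the paper does not state its own split.

```python
import pandas as pd

# Minimal sketch, assuming a local CSV export of USPTO-50K with a
# 'reactants>reagents>production' column; both the file name and the
# column name are assumptions, not something this paper specifies.
df = pd.read_csv("uspto50k.csv")  # hypothetical local file name
reactions = df["reactants>reagents>production"].tolist()

# The paper does not state its split; 80/10/10 random is only the
# conventional USPTO-50K choice, not the authors' documented setup.
shuffled = pd.Series(reactions).sample(frac=1.0, random_state=42).tolist()
n_train = int(0.8 * len(shuffled))
n_val = int(0.1 * len(shuffled))
train = shuffled[:n_train]
val = shuffled[n_train:n_train + n_val]
test = shuffled[n_train + n_val:]
print(f"train/val/test sizes: {len(train)}/{len(val)}/{len(test)}")
```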
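
The two types of distribution shift the paper formalizes are not detailed in this table, so the sketch below is illustrative only: it shows one common way to induce a structure-level shift in retrosynthesis benchmarks by holding out entire Bemis-Murcko product-scaffold groups, so that test-time products are structurally unseen during training. The helper names (`product_scaffold`, `scaffold_ood_split`) and the greedy group-assignment heuristic are assumptions for illustration, not the paper's actual benchmark construction.

```python
from collections import defaultdict
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

def product_scaffold(rxn_smiles: str) -> str:
    """Bemis-Murcko scaffold SMILES of the product side of a
    'reactants>>product' reaction SMILES."""
    mol = Chem.MolFromSmiles(rxn_smiles.split(">>")[-1])
    return MurckoScaffold.MurckoScaffoldSmiles(mol=mol) if mol is not None else ""

def scaffold_ood_split(reactions, test_fraction=0.2):
    """Hold out whole product-scaffold groups so that test-time
    products come from scaffolds never seen at training time."""
    groups = defaultdict(list)
    for rxn in reactions:
        groups[product_scaffold(rxn)].append(rxn)
    # Greedy fill: largest scaffold groups go to train until the train
    # budget is met; the remaining (smaller, rarer) groups go to test.
    ordered = sorted(groups.values(), key=len, reverse=True)
    train_budget = (1.0 - test_fraction) * len(reactions)
    train, test = [], []
    for group in ordered:
        (train if len(train) < train_budget else test).extend(group)
    return train, test
```

Grouping by the product's scaffold (rather than splitting reactions at random) is what makes the held-out set a distribution shift rather than an in-distribution sample, which is the contrast the paper's benchmarks are built to expose.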