Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Reaction Prediction via Interaction Modeling of Symmetric Difference Shingle Sets
Authors: Runhan Shi, Letian Chen, Gufeng Yu, Yang Yang
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that ReaDISH improves reaction prediction performance across diverse benchmarks. It shows enhanced robustness with an average improvement of 8.76% on R² under permutation perturbations. Section 4 Experiments |
| Researcher Affiliation | Academia | (1) AGI Institute, School of Computer Science, Shanghai Jiao Tong University; (2) Shanghai Innovation Institute |
| Pseudocode | Yes | Algorithm 1 Pipeline to extract symmetric difference shingles |
| Open Source Code | Yes | The code is available at https://github.com/Meteor-han/ReaDISH. |
| Open Datasets | Yes | We collect 3.7M chemical reactions for pre-training based on the United States Patent and Trademark Office (USPTO) dataset [38] and the Chemical Journals with High Impact Factor (CJHIF) dataset [39]. We use seven datasets across a wide range of chemical tasks, including: (1) yield prediction, the Buchwald-Hartwig (BH) dataset [13], the Suzuki-Miyaura (SM) dataset [14], the real-world electronic laboratory notebook (ELN) dataset [40], and the Ni-catalyzed C-O bond activation (NiCOlit) dataset [41]; (2) enantioselectivity prediction, the asymmetric N,S-acetal formation (N,S-acetal) dataset [42]; (3) conversion rate estimation, the C-heteroatom-coupling reactions (C-heteroatom) dataset [43]; and (4) reaction type classification, the USPTO_TPL dataset [8]. |
| Dataset Splits | Yes | To assess the generalizability of our approach, we consider both random and out-of-sample splits. In the out-of-sample split, the test set contains reactions involving molecules that do not appear in the training set. Table 3: The statistics of pre-training datasets (first row) and evaluation datasets (remaining rows). All methods are tested on (1) the same ten random splits and (2) the same out-of-sample split across five random runs to ensure fair comparisons, with the average results reported. |
| Hardware Specification | Yes | All experiments are executed on 4 NVIDIA RTX3090 GPUs. |
| Software Dependencies | No | We use PyTorch [62] with the Adam [63] optimizer and the cosine learning rate decay strategy for training. We apply K-means clustering by scikit-learn [61] with different values of K. We remove duplicate records and invalid reactions for pre-training by RDKit [59]. |
| Experiment Setup | Yes | Table 5: Parameters during pre-training. Table 6: Search space of parameters during fine-tuning. We use PyTorch [62] with the Adam [63] optimizer and the cosine learning rate decay strategy for training. |
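
The out-of-sample split quoted in the Dataset Splits row (test reactions involve molecules that do not appear in the training set) can be illustrated with a minimal sketch. This is not the authors' splitting procedure. The function name `out_of_sample_split`, the `held_out_fraction` parameter, and the representation of a reaction as a tuple of molecule identifiers are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Set, Tuple

# Hypothetical representation: a reaction is a tuple of molecule identifiers
# (for example SMILES strings). The paper's actual data format may differ.
Reaction = Tuple[str, ...]


def out_of_sample_split(
    reactions: Sequence[Reaction],
    held_out_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[Reaction], List[Reaction], Set[str]]:
    """Split reactions so every test reaction contains an unseen molecule.

    A random subset of molecules is held out. Any reaction that involves a
    held-out molecule goes to the test set; all other reactions form the
    training set, so held-out molecules never appear during training.
    """
    # Sort for determinism before sampling with a seeded generator.
    molecules = sorted({m for reaction in reactions for m in reaction})
    rng = random.Random(seed)
    n_held_out = max(1, round(len(molecules) * held_out_fraction))
    held_out = set(rng.sample(molecules, n_held_out))

    train = [r for r in reactions if not held_out.intersection(r)]
    test = [r for r in reactions if held_out.intersection(r)]
    return train, test, held_out


if __name__ == "__main__":
    # Toy identifiers standing in for real molecules.
    toy = [("A", "B"), ("A", "C"), ("D", "B"), ("D", "C"), ("E", "F")]
    train, test, held_out = out_of_sample_split(toy)
    print("held out:", held_out)
    print("train:", train)
    print("test:", test)
```

Holding out molecules rather than reactions is one simple way to satisfy the stated property. The size of the resulting test set depends on how often the held-out molecules occur, so in practice the fraction would need tuning per dataset.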