Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Predicting Organic Reaction Outcomes with Weisfeiler-Lehman Network
Authors: Wengong Jin, Connor Coley, Regina Barzilay, Tommi Jaakkola
NeurIPS 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on two datasets derived from the USPTO [13], and compare our methods to the current top performing system [3]. Our method achieves 83.9% and 77.9% accuracy on two datasets, outperforming the baseline approach by 10%, while running 140 times faster. Finally, we demonstrate that the model outperforms domain experts by a large margin. |
| Researcher Affiliation | Academia | Wengong Jin Connor W. Coley Regina Barzilay Tommi Jaakkola Computer Science and Artificial Intelligence Lab, MIT Department of Chemical Engineering, MIT EMAIL, EMAIL |
| Pseudocode | No | The paper describes its methods using prose and mathematical equations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data available at https://github.com/wengong-jin/nips17-rexgen |
| Open Datasets | Yes | As a source of data for our experiments, we used reactions from USPTO granted patents, collected by Lowe [13]. |
| Dataset Splits | Yes | This dataset is divided into 400K, 40K, and 40K for training, development, and testing purposes. We follow their split, with 10.5K, 1.5K, and 3K for training, development, and testing. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions that models are 'optimized with Adam [10]', but it does not specify version numbers for any software dependencies, libraries, or programming languages used in the implementation. |
| Experiment Setup | Yes | Both our local and global models are built upon a Weisfeiler-Lehman Network, with unrolled depth 3. All models are optimized with Adam [10], with learning rate decay factor 0.9. |
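
The split sizes quoted in the Dataset Splits row can be converted into proportions for a quick sanity check. The following is a minimal sketch only: the labels `larger_dataset` and `smaller_dataset` and the helper `split_fractions` are hypothetical names introduced here for illustration, and the sizes are the rounded figures from the table (400K/40K/40K and 10.5K/1.5K/3K), not exact reaction counts.

```python
# Rounded split sizes as quoted in the "Dataset Splits" row.
# The dictionary keys are hypothetical labels, not names from the source.
SPLITS = {
    "larger_dataset": (400_000, 40_000, 40_000),
    "smaller_dataset": (10_500, 1_500, 3_000),
}


def split_fractions(train, dev, test):
    """Return the share of examples in each split as a dict of floats."""
    total = train + dev + test
    return {
        "train": train / total,
        "dev": dev / total,
        "test": test / total,
    }


if __name__ == "__main__":
    for name, sizes in SPLITS.items():
        fractions = split_fractions(*sizes)
        summary = ", ".join(f"{k}: {v:.1%}" for k, v in fractions.items())
        print(f"{name}: {summary}")
```

Under these rounded figures, the smaller dataset holds out a larger share for testing than the larger one does, which is worth keeping in mind when comparing the two reported accuracies.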