Molecular Optimization Model with Patentability Constraint
Authors: Sally Turutov, Kira Radinsky
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through empirical evaluation, we demonstrate the superior performance of our approach compared to state-of-the-art molecular optimization methods both in chemical property optimization and patentability. We empirically evaluate our proposed model on numerous molecule optimization tasks, demonstrating its ability to maintain similarity and optimize properties while considering patent constraints. Our results show that our model successfully reduces the similarity of optimized molecules to existing patents while still generating highly optimized molecules, thus outperforming the state-of-the-art (SOTA) models. Additionally, comprehensive ablation experiments provide detailed insights into the effectiveness of our approach and its individual components. |
| Researcher Affiliation | Academia | Sally Turutov, Kira Radinsky Technion Israel Institute of Technology turutovsally@campus.technion.ac.il, kirar@cs.technion.ac.il |
| Pseudocode | Yes | Algorithm 1 METN Training Algorithm, Algorithm 2 EETN Training Algorithm, Algorithm 3 Extended-EETN Training Algorithm, Algorithm 4 End-to-End Training Algorithm |
| Open Source Code | Yes | To facilitate further research and exploration of the problem, we provide the community with access to our code and data through the following link: https://github.com/SallyTurutov/MOMP. |
| Open Datasets | Yes | We utilized datasets from (Jin et al. 2019). ... The SureChEMBL dataset (Papadatos et al. 2016) focuses on patent compounds, providing Maximum Common Substructures (MCSs) representing shared core chemical structures within a patent. |
| Dataset Splits | No | The paper mentions using 'training sets of molecules' and partitioning data into domain-specific sets (A, B, C), and using 'original datasets for training and testing' for baseline models. However, it does not provide specific percentages, counts, or explicit instructions for train/validation/test splits for its own experimental setup with the A, B, C domains. |
| Hardware Specification | No | No specific hardware specifications (e.g., GPU models, CPU types, or memory details) used for running the experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions the use of the Adam optimizer but does not specify version numbers for any software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | We employ the Adam optimizer with a learning rate of 3×10⁻⁴, a mini-batch size of 32, and set the maximum number of training epochs E_max^Train to 12 for QED and 18 for DRD2. The regularization parameters are λAB = λBC = λAC = 2. |
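The hyperparameters reported in the Experiment Setup row can be collected into a single configuration helper. This is a minimal sketch for reimplementation purposes only; the function name, dict keys, and task labels are illustrative assumptions, not taken from the authors' released code. Only the numeric values (Adam, lr 3×10⁻⁴, batch size 32, max epochs 12/18, λAB = λBC = λAC = 2) come from the paper.

```python
def training_config(task: str) -> dict:
    """Return the hyperparameters reported in the paper for a given task.

    Hypothetical helper: structure and key names are illustrative. The paper
    reports Adam with learning rate 3e-4, mini-batch size 32, maximum epochs
    of 12 (QED) / 18 (DRD2), and regularization weights
    lambda_AB = lambda_BC = lambda_AC = 2.
    """
    max_epochs = {"QED": 12, "DRD2": 18}
    if task not in max_epochs:
        raise ValueError(f"unknown task: {task}")
    return {
        "optimizer": "Adam",
        "learning_rate": 3e-4,
        "batch_size": 32,
        "max_epochs": max_epochs[task],
        "lambda_AB": 2.0,
        "lambda_BC": 2.0,
        "lambda_AC": 2.0,
    }
```

A reimplementation could pass `training_config("QED")["learning_rate"]` directly to its optimizer constructor; keeping the reported values in one place makes it easier to audit a reproduction attempt against the paper.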