Molecular Optimization Model with Patentability Constraint

Authors: Sally Turutov, Kira Radinsky

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through empirical evaluation, we demonstrate the superior performance of our approach compared to state-of-the-art molecular optimization methods both in chemical property optimization and patentability. We empirically evaluate our proposed model on numerous molecule optimization tasks, demonstrating its ability to maintain similarity and optimize properties while considering patent constraints. Our results show that our model successfully reduces the similarity of optimized molecules to existing patents while still generating highly optimized molecules, thus outperforming the state-of-the-art (SOTA) models. Additionally, comprehensive ablation experiments provide detailed insights into the effectiveness of our approach and its individual components.
Researcher Affiliation | Academia | Sally Turutov, Kira Radinsky; Technion - Israel Institute of Technology; turutovsally@campus.technion.ac.il, kirar@cs.technion.ac.il
Pseudocode | Yes | Algorithm 1: METN Training Algorithm; Algorithm 2: EETN Training Algorithm; Algorithm 3: Extended-EETN Training Algorithm; Algorithm 4: End-to-End Training Algorithm
Open Source Code | Yes | To facilitate further research and exploration of the problem, we provide the community with access to our code and data through the following link: https://github.com/SallyTurutov/MOMP.
Open Datasets | Yes | We utilized datasets from (Jin et al. 2019). ... The SureChEMBL dataset (Papadatos et al. 2016) focuses on patent compounds, providing Maximum Common Substructures (MCSs) representing shared core chemical structures within a patent.
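As an illustration of the MCS concept the dataset relies on, the sketch below computes a Maximum Common Substructure with RDKit's rdFMCS module; the example SMILES are hypothetical and RDKit is our choice of tool, not one the paper names.

    from rdkit import Chem
    from rdkit.Chem import rdFMCS

    # Hypothetical compounds standing in for molecules from one patent.
    smiles = ["c1ccccc1CCN", "c1ccccc1CCO", "c1ccccc1CC(=O)N"]
    mols = [Chem.MolFromSmiles(s) for s in smiles]

    # FindMCS searches for the largest substructure shared by all inputs,
    # i.e. the shared core chemical structure within the patent.
    result = rdFMCS.FindMCS(mols)
    print(result.smartsString)               # SMARTS pattern of the common core
    print(result.numAtoms, result.numBonds)  # size of the shared substructure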
Dataset Splits | No | The paper mentions 'training sets of molecules', partitions data into domain-specific sets (A, B, C), and uses the 'original datasets for training and testing' for baseline models. However, it does not provide specific percentages, counts, or explicit instructions for train/validation/test splits for its own experimental setup with the A, B, C domains.
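Readers reconstructing the experiments must therefore choose their own partition. A minimal sketch of one possible scheme, where the random shuffling and 80/10/10 ratios are our assumption rather than anything the paper reports:

    import random

    def split_dataset(smiles_list, train_frac=0.8, valid_frac=0.1, seed=42):
        """Randomly partition molecules into train/validation/test lists.

        The 80/10/10 ratios are illustrative only; the paper does not
        report the splits it used for the A, B, C domains.
        """
        rng = random.Random(seed)
        shuffled = list(smiles_list)
        rng.shuffle(shuffled)
        n_train = int(len(shuffled) * train_frac)
        n_valid = int(len(shuffled) * valid_frac)
        return (shuffled[:n_train],
                shuffled[n_train:n_train + n_valid],
                shuffled[n_train + n_valid:])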
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU types, or memory) used to run the experiments.
Software Dependencies | No | The paper mentions the use of the Adam optimizer but does not specify version numbers for any software dependencies or libraries used in the implementation.
Experiment Setup | Yes | We employ the Adam optimizer with a learning rate of 3 × 10^-4, a mini-batch size of 32, and set maximum epochs E_max^Train to 12 for QED and 18 for DRD2. The regularization parameters are λ_AB = λ_BC = λ_AC = 2.
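These reported settings translate directly into code. A minimal sketch in PyTorch, where the linear model is a placeholder for the paper's actual architecture (available in the linked repository), while the hyperparameter values are the ones quoted above:

    import torch

    model = torch.nn.Linear(128, 128)   # placeholder for the paper's network

    # Reported hyperparameters: Adam, learning rate 3e-4, mini-batch size 32.
    optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
    BATCH_SIZE = 32

    # Maximum training epochs per property-optimization task (E_max^Train).
    MAX_EPOCHS = {"QED": 12, "DRD2": 18}

    # Regularization weights for the three domain pairs.
    LAMBDA_AB = LAMBDA_BC = LAMBDA_AC = 2.0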