Multi-Objective Molecule Generation using Interpretable Substructures
Authors: Wengong Jin, Regina Barzilay, Tommi Jaakkola
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on various drug design tasks and demonstrate significant improvements over state-of-the-art baselines in terms of accuracy, diversity, and novelty of generated compounds. |
| Researcher Affiliation | Academia | Wengong Jin 1 Regina Barzilay 1 Tommi Jaakkola 1 1MIT CSAIL. Correspondence to: Wengong Jin <wengong@csail.mit.edu>. |
| Pseudocode | Yes | Algorithm 1 Training method with n property constraints. |
| Open Source Code | Yes | 1https://github.com/wengong-jin/multiobj-rationale |
| Open Datasets | Yes | We pre-train all the models on the same ChEMBL dataset, which contains 1.02 million training examples. On the four-property generation task, our model is fine-tuned for L = 50 iterations, with each rationale being expanded for K = 200 times. Following Li et al. (2018), the property prediction model is a random forest using Morgan fingerprint features (Rogers & Hahn, 2010). |
| Dataset Splits | Yes | For each property, we split its property dataset into 80%, 10% and 10% for training, validation and testing. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions RDKit but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We set the positive threshold δi = 0.5. For each positive molecule, we run 20 iterations of MCTS with cpuct = 10. Our model is fine-tuned for L = 50 iterations, with each rationale being expanded for K = 200 times. |
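The 80%/10%/10% per-property split reported above can be sketched as follows. This is a minimal stand-alone illustration, not the authors' code; the function name and the fixed seed are assumptions for reproducibility, and in practice the splits would be applied to each property dataset of molecules rather than to integers.

```python
import random

def split_dataset(examples, seed=0):
    """Shuffle and split a dataset into 80% train / 10% validation / 10% test,
    matching the per-property split described in the report.
    The seed is a hypothetical choice made here so the split is reproducible."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n = len(examples)
    n_train = int(0.8 * n)   # 80% for training
    n_val = int(0.1 * n)     # 10% for validation
    # remaining ~10% for testing
    train = examples[:n_train]
    val = examples[n_train:n_train + n_val]
    test = examples[n_train + n_val:]
    return train, val, test

train, val, test = split_dataset(range(1000))
# sizes: 800 / 100 / 100
```

Splitting after a seeded shuffle keeps the three subsets disjoint and makes the partition deterministic across runs.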