LIMO: Latent Inceptionism for Targeted Molecule Generation
Authors: Peter Eckmann, Kunyang Sun, Bo Zhao, Mudong Feng, Michael Gilson, Rose Yu
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments show that LIMO performs competitively on benchmark tasks and markedly outperforms state-of-the-art techniques on the novel task of generating drug-like compounds with high binding affinity, reaching nanomolar range against two protein targets. |
| Researcher Affiliation | Academia | 1 Department of Computer Science and Engineering, UC San Diego, La Jolla, California, United States; 2 Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California, United States; 3 Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, California, United States. |
| Pseudocode | Yes | Algorithm 1 Molecule fine-tuning algorithm. |
| Open Source Code | Yes | Code is available at https://github.com/Rose-STL-Lab/LIMO. |
| Open Datasets | Yes | For all optimization tasks, we use the benchmark ZINC250k dataset, which contains 250,000 purchasable, drug-like molecules (Irwin et al., 2012). ... For the random generation task, we train on the ZINC-based 2 million molecule MOSES dataset (Polykovskiy et al., 2020). |
| Dataset Splits | Yes | 100k training examples were used for all properties except binding affinity, where 10k were used due to speed concerns. ... an unseen test set of 1,000 molecules was generated using the VAE and used to test the prediction performance of the property predictor. |
| Hardware Specification | Yes | All experiments, including baselines, were run on two GTX 1080 Ti GPUs, one for running PyTorch code and the other for running AutoDock-GPU, and 4 CPU cores with 32 GB memory. |
| Software Dependencies | Yes | For the VAE, we use... The VAE is trained... using the Adam optimizer. ... To generate docking scores from a SMILES string produced by LIMO, we perform the following steps: ... we convert it to a 3D .pdbqt file using obabel 2.4.0 (O'Boyle et al., 2011). We run AutoDock-GPU (Santos-Martins et al., 2021). (See the docking pipeline sketch below the table.) |
| Experiment Setup | Yes | For the VAE, we use a 64-dimensional embedding layer that feeds into four batch-normalized 1,000-dimensional (2,000 for first layer) linear layers with ReLU activation. This generates a Gaussian output for the 1024-dimensional latent space... The VAE is trained over 18 epochs with a learning rate of 0.0001 using the Adam optimizer. For the property predictor, we use three 1,000-dimensional linear layers with ReLU activation... trained over 5 epochs, then perform backward optimization with a learning rate of 0.1 for 10 epochs. (See the architecture sketch below the table.) |
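
The docking-score pipeline quoted under Software Dependencies can be illustrated as a short script. This is a minimal sketch, not the paper's implementation: obabel 2.4.0 and AutoDock-GPU are the tools the paper names, but the file names, the `autodock_gpu_128wi` binary name, the pre-computed receptor grid (`protein.maps.fld`), and the omitted score parsing are assumptions.

```python
# Hedged sketch: SMILES -> 3D .pdbqt via Open Babel, then scoring with AutoDock-GPU.
# Binary name, file names, and the receptor grid-map path are assumptions.
import subprocess

def dock_smiles(smiles: str,
                maps_fld: str = "protein.maps.fld",    # assumed pre-computed AutoDock grid maps
                binary: str = "autodock_gpu_128wi"):   # assumed AutoDock-GPU binary name
    # 1. Generate 3D coordinates and write a .pdbqt ligand file with Open Babel.
    subprocess.run(
        ["obabel", f"-:{smiles}", "-O", "ligand.pdbqt", "--gen3d"],
        check=True,
    )
    # 2. Dock the ligand against the receptor grid with AutoDock-GPU
    #    (--ffile = receptor maps, --lfile = ligand file).
    subprocess.run(
        [binary, "--ffile", maps_fld, "--lfile", "ligand.pdbqt"],
        check=True,
    )
    # 3. The binding score would then be parsed from the AutoDock-GPU output
    #    (e.g. the generated docking log); parsing is omitted here.

# Example usage:
# dock_smiles("CCO")
```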
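
The VAE encoder and property predictor described under Experiment Setup can be approximated in PyTorch as below. This is a minimal sketch following the quoted dimensions only; the vocabulary size, maximum sequence length, exact layer wiring, and the interpretation of "epochs" for the backward-optimization step are assumptions, and the official repository (https://github.com/Rose-STL-Lab/LIMO) is the reference implementation.

```python
# Hedged sketch of the quoted architecture; VOCAB_SIZE, MAX_LEN, and wiring are assumptions.
import torch
import torch.nn as nn

VOCAB_SIZE = 100   # assumed token vocabulary size
MAX_LEN = 72       # assumed maximum token sequence length
LATENT_DIM = 1024  # "1024-dimensional latent space"

class Encoder(nn.Module):
    """64-dim embedding -> four batch-normalized ReLU linear layers -> Gaussian latent."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, 64)          # "64-dimensional embedding layer"
        dims = [64 * MAX_LEN, 2000, 1000, 1000, 1000]      # "(2,000 for first layer)"
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.BatchNorm1d(d_out), nn.ReLU()]
        self.mlp = nn.Sequential(*layers)
        self.to_mu = nn.Linear(1000, LATENT_DIM)           # Gaussian mean
        self.to_logvar = nn.Linear(1000, LATENT_DIM)       # Gaussian log-variance

    def forward(self, tokens):                 # tokens: (batch, MAX_LEN) long tensor
        h = self.embed(tokens).flatten(1)      # (batch, 64 * MAX_LEN)
        h = self.mlp(h)
        return self.to_mu(h), self.to_logvar(h)

class PropertyPredictor(nn.Module):
    """Three 1,000-dimensional ReLU linear layers mapping latent z to a scalar property."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 1000), nn.ReLU(),
            nn.Linear(1000, 1000), nn.ReLU(),
            nn.Linear(1000, 1000), nn.ReLU(),
            nn.Linear(1000, 1),
        )

    def forward(self, z):
        return self.net(z)

# Quoted training settings: VAE for 18 epochs with Adam, lr = 1e-4;
# property predictor for 5 epochs; backward optimization on z with lr = 0.1
# for 10 "epochs" (treated here as optimization steps, an assumption).
```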