LIMO: Latent Inceptionism for Targeted Molecule Generation

Authors: Peter Eckmann, Kunyang Sun, Bo Zhao, Mudong Feng, Michael Gilson, Rose Yu

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Comprehensive experiments show that LIMO performs competitively on benchmark tasks and markedly outperforms state-of-the-art techniques on the novel task of generating drug-like compounds with high binding affinity, reaching nanomolar range against two protein targets.
Researcher Affiliation | Academia | (1) Department of Computer Science and Engineering, UC San Diego, La Jolla, California, United States; (2) Department of Chemistry and Biochemistry, UC San Diego, La Jolla, California, United States; (3) Skaggs School of Pharmacy and Pharmaceutical Sciences, UC San Diego, La Jolla, California, United States.
Pseudocode | Yes | Algorithm 1: Molecule fine-tuning algorithm.
Open Source Code | Yes | Code is available at https://github.com/Rose-STL-Lab/LIMO.
Open Datasets | Yes | For all optimization tasks, we use the benchmark ZINC250k dataset, which contains 250,000 purchasable, drug-like molecules (Irwin et al., 2012). ... For the random generation task, we train on the ZINC-based 2 million molecule MOSES dataset (Polykovskiy et al., 2020).
Dataset Splits | Yes | 100k training examples were used for all properties except binding affinity, where 10k were used due to speed concerns. ... an unseen test set of 1,000 molecules was generated using the VAE and used to test the prediction performance of the property predictor. (A sketch of this test-set sampling step appears after the table.)
Hardware Specification | Yes | All experiments, including baselines, were run on two GTX 1080 Ti GPUs, one for running PyTorch code and the other for running AutoDock-GPU, and 4 CPU cores with 32 GB memory.
Software Dependencies | Yes | For the VAE, we use... The VAE is trained... using the Adam optimizer. ... To generate docking scores from a SMILES string produced by LIMO, we perform the following steps: ... we convert it to a 3D .pdbqt file using obabel 2.4.0 (O'Boyle et al., 2011). We run AutoDock-GPU (Santos-Martins et al., 2021). (A sketch of this docking pipeline appears after the table.)
Experiment Setup | Yes | For the VAE, we use a 64-dimensional embedding layer that feeds into four batch-normalized 1,000-dimensional (2,000 for first layer) linear layers with ReLU activation. This generates a Gaussian output for the 1024-dimensional latent space... The VAE is trained over 18 epochs with a learning rate of 0.0001 using the Adam optimizer. For the property predictor, we use three 1,000-dimensional linear layers with ReLU activation... trained over 5 epochs, then perform backward optimization with a learning rate of 0.1 for 10 epochs. (A sketch of these components appears below.)
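
The Dataset Splits row notes that the property predictor is evaluated on 1,000 molecules generated by the VAE itself. A minimal sketch of that sampling step, assuming a trained VAE object with `decode` and `detokenize` methods (placeholder names, not LIMO's actual API):

```python
# Sketch: draw 1,000 latent vectors from the standard-normal prior and
# decode them into SMILES to form the predictor's unseen test set.
# `vae.decode` / `vae.detokenize` are placeholder names, not LIMO's API.
import torch

@torch.no_grad()
def sample_test_set(vae, n=1000, latent_dim=1024, device="cpu"):
    z = torch.randn(n, latent_dim, device=device)       # sample from the prior
    logits = vae.decode(z)                              # (n, seq_len, vocab)
    token_ids = logits.argmax(dim=-1)                   # greedy decoding
    return [vae.detokenize(ids) for ids in token_ids]   # SMILES strings
```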
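
The Software Dependencies row describes a two-step docking pipeline: Open Babel converts a SMILES string into a 3D .pdbqt ligand, and AutoDock-GPU docks it against precomputed receptor grid maps. A hedged sketch, assuming both tools are on PATH and a receptor grid has already been prepared; the binary name (autodock_gpu_64wi) and receptor path are assumptions, and exact flags may differ between tool versions:

```python
# Sketch: SMILES -> 3D .pdbqt via Open Babel, then dock with AutoDock-GPU.
# Binary name and receptor path are assumptions; score parsing is omitted.
import subprocess

def dock_smiles(smiles, receptor_fld="receptor.maps.fld",
                ligand_path="ligand.pdbqt"):
    # 1. Embed the SMILES in 3D and write a .pdbqt ligand file
    subprocess.run(["obabel", f"-:{smiles}", "-O", ligand_path, "--gen3d"],
                   check=True)
    # 2. Dock against the precomputed receptor grid maps (.maps.fld)
    subprocess.run(["autodock_gpu_64wi",
                    "--ffile", receptor_fld,
                    "--lfile", ligand_path],
                   check=True)
    # Binding scores are reported in AutoDock-GPU's output files.

# Example (requires both tools installed and receptor maps prepared):
# dock_smiles("CCO")
```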
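
The Experiment Setup row pins down the encoder and property-predictor shapes and the latent-space ("backward") optimization schedule. A PyTorch sketch of those components under assumed tokenization settings (SEQ_LEN and VOCAB are guesses; the decoder, VAE loss, and training loops are elided):

```python
# Sketch of the quoted setup in PyTorch. SEQ_LEN and VOCAB are assumed
# tokenization settings; the decoder and training loops are elided.
import torch
import torch.nn as nn

SEQ_LEN, VOCAB, LATENT = 128, 64, 1024   # sequence/vocab sizes are guesses

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 64)            # 64-dim embedding layer
        dims = [SEQ_LEN * 64, 2000, 1000, 1000, 1000]   # first layer is 2,000-dim
        blocks = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            blocks += [nn.Linear(d_in, d_out), nn.BatchNorm1d(d_out), nn.ReLU()]
        self.mlp = nn.Sequential(*blocks)
        self.mu = nn.Linear(dims[-1], LATENT)           # Gaussian mean
        self.logvar = nn.Linear(dims[-1], LATENT)       # Gaussian log-variance

    def forward(self, tokens):                          # tokens: (batch, SEQ_LEN)
        h = self.mlp(self.embed(tokens).flatten(1))
        return self.mu(h), self.logvar(h)

# Property predictor: three 1,000-dimensional ReLU layers on the latent code,
# with a scalar output head (the output dimension is an assumption).
predictor = nn.Sequential(
    nn.Linear(LATENT, 1000), nn.ReLU(),
    nn.Linear(1000, 1000), nn.ReLU(),
    nn.Linear(1000, 1000), nn.ReLU(),
    nn.Linear(1000, 1),
)

# "Backward optimization": freeze the predictor and optimize the latent code
# itself with lr 0.1 for 10 steps ("epochs" in the quote).
for p in predictor.parameters():
    p.requires_grad_(False)
z = torch.randn(1, LATENT, requires_grad=True)
opt = torch.optim.Adam([z], lr=0.1)
for _ in range(10):
    opt.zero_grad()
    loss = predictor(z).sum()   # minimize the predicted property value
    loss.backward()
    opt.step()
```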