Entropy-Reinforced Planning with Large Language Models for Drug Discovery

Authors: Xuefeng Liu, Chih-Chan Tien, Peng Ding, Songhao Jiang, Rick L. Stevens

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated ERP on the SARS-CoV-2 virus (3CLPro) and human cancer cell target protein (RTCB) benchmarks and demonstrated that, in both benchmarks, ERP consistently outperforms the current state-of-the-art algorithm by 1-5 percent, and baselines by 5-10 percent, respectively. Moreover, such improvement is robust across Transformer models trained with different objectives. Finally, to further illustrate the capabilities of ERP, we tested our algorithm on three code generation benchmarks and outperformed the current state-of-the-art approach as well. Our code is publicly available at: https://github.com/xuefeng-cs/ERP.
Researcher Affiliation | Collaboration | 1 Department of Computer Science, University of Chicago, Chicago, IL, USA; 2 Healin-AI LLC, Chicago, IL, USA; 3 Argonne National Laboratory, Lemont, IL, USA.
Pseudocode | Yes | We present the pseudocode of our algorithm in Algorithm 1 and detail the entire process in the following. (A hedged sketch of such an entropy-guided decoding loop appears after this table.)
Open Source Code | Yes | Our code is publicly available at: https://github.com/xuefeng-cs/ERP.
Open Datasets | Yes | The pretrained model is a 124M-parameter GPT-2 model trained using a BPE tokenizer (Bostrom and Durrett, 2020) on a diverse dataset of 10.7 million SMILES strings of drug-like molecules randomly sampled from the ZINC database (Irwin and Shoichet, 2005). (A sketch of this tokenizer-and-pretraining setup appears after the table.)
Dataset Splits | No | The paper mentions training models on datasets like ZINC and ZINC15, and performing evaluation, but does not explicitly provide the specific percentages or counts for the train, validation, and test splits used in its experiments.
Hardware Specification | Yes | We trained our docking surrogate models using four nodes of the supercomputer, where each node contains one 64-core CPU and four A100 GPUs (Facility). (A generic distributed-training sketch appears after the table.)
Software Dependencies | No | The paper mentions using GPT-2 models and a BPE tokenizer, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries used in the implementation.
Experiment Setup | Yes | We did a limited hyperparameter search over the possible values in the range shown in Table 6, then used the same hyperparameters to compare the different algorithms. Table 6: Hyperparameters and possible values. (A hyperparameter-search sketch appears after the table.)
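
The paper's Algorithm 1 is not reproduced in this report. Purely as an illustration of the kind of entropy-reinforced, reward-guided decoding the title suggests, the minimal sketch below adds the entropy of the model's next-token distribution as a bonus to a beam-style search score over a causal SMILES language model and re-ranks finished candidates with an external reward (e.g., a docking surrogate). The function names, the entropy weight lam, and the reward_fn re-ranking are assumptions for illustration, not the authors' exact procedure.

    # Hypothetical sketch only -- NOT the authors' Algorithm 1.
    # Beam-style decoding over a causal LM where the next-token entropy is
    # added as an exploration bonus; finished SMILES are re-ranked by reward_fn.
    import torch
    import torch.nn.functional as F

    def entropy_bonus(logits):
        # Shannon entropy of the next-token distribution (one scalar per sequence).
        probs = F.softmax(logits, dim=-1)
        return -(probs * torch.log(probs + 1e-12)).sum(dim=-1)

    @torch.no_grad()
    def erp_style_decode(model, tokenizer, reward_fn, beam_width=8, max_len=64, lam=0.1):
        device = next(model.parameters()).device
        # Assumes the tokenizer defines a BOS token; EOS handling omitted for brevity.
        beams = [(torch.tensor([[tokenizer.bos_token_id]], device=device), 0.0)]
        for _ in range(max_len):
            candidates = []
            for seq, score in beams:
                logits = model(seq).logits[:, -1, :]            # next-token logits
                ent = entropy_bonus(logits).item()              # exploration term
                topk = torch.topk(F.log_softmax(logits, dim=-1), beam_width, dim=-1)
                for logp, tok in zip(topk.values[0], topk.indices[0]):
                    new_seq = torch.cat([seq, tok.view(1, 1)], dim=1)
                    candidates.append((new_seq, score + logp.item() + lam * ent))
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        smiles = [tokenizer.decode(seq[0], skip_special_tokens=True) for seq, _ in beams]
        return max(smiles, key=reward_fn)                       # highest-reward molecule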
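
For the pretraining-data row, the sketch below shows one plausible way to train a BPE tokenizer on a file of SMILES strings and instantiate a GPT-2-small model from scratch with Hugging Face transformers/tokenizers. The file name zinc_smiles.txt, the 10,000-token vocabulary, and the byte-level BPE variant are illustrative assumptions; the paper specifies only a BPE tokenizer and a 124M GPT-2 trained on 10.7 million ZINC SMILES.

    # Hypothetical sketch: SMILES BPE tokenizer + from-scratch GPT-2 small.
    # Paths, vocab size, and the byte-level BPE choice are placeholders.
    import os
    from tokenizers import ByteLevelBPETokenizer
    from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

    # 1) Train a BPE tokenizer on a file with one SMILES string per line.
    os.makedirs("smiles_tokenizer", exist_ok=True)
    bpe = ByteLevelBPETokenizer()
    bpe.train(files=["zinc_smiles.txt"], vocab_size=10_000, min_frequency=2,
              special_tokens=["<|endoftext|>"])
    bpe.save_model("smiles_tokenizer")                 # writes vocab.json + merges.txt

    tokenizer = GPT2Tokenizer.from_pretrained("smiles_tokenizer")
    tokenizer.pad_token = tokenizer.eos_token

    # 2) GPT-2-small architecture (~124M parameters with the full GPT-2 vocabulary;
    #    somewhat fewer here because of the reduced SMILES vocabulary).
    config = GPT2Config(vocab_size=len(tokenizer))
    model = GPT2LMHeadModel(config)
    print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")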
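
The hardware row states that docking surrogate models were trained across four nodes with four A100s each, but no launch scripts are quoted. As a generic illustration only, the sketch below shows a standard PyTorch DistributedDataParallel setup; the surrogate architecture, the fingerprint input size, and the torchrun launch line are assumptions, not the authors' training code.

    # Hypothetical sketch: multi-GPU data-parallel training of a docking-score
    # surrogate. Architecture and input size are placeholders.
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group("nccl")                 # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])      # set by torchrun
        torch.cuda.set_device(local_rank)
        surrogate = torch.nn.Sequential(                # placeholder regressor:
            torch.nn.Linear(2048, 512),                 # fingerprint -> hidden
            torch.nn.ReLU(),
            torch.nn.Linear(512, 1),                    # predicted docking score
        ).cuda(local_rank)
        surrogate = DDP(surrogate, device_ids=[local_rank])
        # ... build a DistributedSampler-backed DataLoader and run the training loop ...
        dist.destroy_process_group()

    if __name__ == "__main__":
        main()  # e.g. torchrun --nnodes=4 --nproc_per_node=4 train_surrogate.py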
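
Finally, for the experiment-setup row, the sketch below illustrates the described procedure of a limited search over candidate hyperparameter values followed by reusing one setting across algorithms. The parameter names and value grids are invented placeholders; the actual ranges are in the paper's Table 6.

    # Hypothetical sketch: limited grid search, then reuse the best setting
    # when comparing algorithms. Names and grids below are placeholders.
    import itertools

    search_space = {
        "entropy_weight": [0.01, 0.05, 0.1],
        "beam_width": [4, 8, 16],
        "temperature": [0.7, 1.0],
    }

    def evaluate(config):
        # Placeholder: run ERP with `config` and return a validation score.
        # Returns 0.0 here only so the loop runs end to end.
        return 0.0

    best_config, best_score = None, float("-inf")
    for values in itertools.product(*search_space.values()):
        config = dict(zip(search_space.keys(), values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    # best_config is then held fixed when comparing ERP against the baselines.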