Entropy-Reinforced Planning with Large Language Models for Drug Discovery

Authors: Xuefeng Liu, Chih-Chan Tien, Peng Ding, Songhao Jiang, Rick L. Stevens

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated ERP on the SARS-CoV-2 virus (3CLPro) and human cancer cell target protein (RTCB) benchmarks and demonstrated that, in both benchmarks, ERP consistently outperforms the current state-of-the-art algorithm by 1-5 percent, and baselines by 5-10 percent, respectively. Moreover, such improvement is robust across Transformer models trained with different objectives. Finally, to further illustrate the capabilities of ERP, we tested our algorithm on three code generation benchmarks and outperformed the current state-of-the-art approach as well. Our code is publicly available at: https://github.com/xuefeng-cs/ERP.
Researcher Affiliation | Collaboration | 1 Department of Computer Science, University of Chicago, Chicago, IL, USA; 2 Healin-AI LLC, Chicago, IL, USA; 3 Argonne National Laboratory, Lemont, IL, USA.
Pseudocode | Yes | We present the pseudocode of our algorithm in Algorithm 1 and detail the entire process in the following. (A hedged sketch of such an entropy-guided decoding loop appears after this table.)
Open Source Code | Yes | Our code is publicly available at: https://github.com/xuefeng-cs/ERP.
Open Datasets | Yes | The pretrained model is a 124M-parameter GPT-2 model trained using a BPE tokenizer (Bostrom and Durrett, 2020) on a diverse dataset of 10.7 million SMILES strings of drug-like molecules randomly sampled from the ZINC database (Irwin and Shoichet, 2005). (A sketch of this tokenizer-and-pretraining setup appears after the table.)
Dataset Splits | No | The paper mentions training models on datasets like ZINC and ZINC15, and performing evaluation, but does not explicitly provide the specific percentages or counts for the train, validation, and test splits used in its experiments.
Hardware Specification | Yes | We trained our docking surrogate models using four nodes of the supercomputer, where each node contains one 64-core CPU and four A100 GPUs (Facility). (A generic distributed-training sketch appears after the table.)
Software Dependencies | No | The paper mentions using GPT-2 models and a BPE tokenizer, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries used in the implementation.
Experiment Setup | Yes | We did a limited hyperparameter search over the possible values in the range shown in Table 6, then used the same hyperparameters to compare the different algorithms. Table 6: Hyperparameters and possible values. (A hyperparameter-search sketch appears after the table.)
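
The paper's Algorithm 1 is not reproduced in this report. Purely as an illustration of the kind of entropy-reinforced, reward-guided decoding the title suggests, the minimal sketch below adds the entropy of the model's next-token distribution as a bonus to a beam-style search score over a causal SMILES language model and re-ranks finished candidates with an external reward (e.g., a docking surrogate). The function names, the entropy weight lam, and the reward_fn re-ranking are assumptions for illustration, not the authors' exact procedure.

    # Hypothetical sketch only -- NOT the authors' Algorithm 1.
    # Beam-style decoding over a causal LM where the next-token entropy is
    # added as an exploration bonus; finished SMILES are re-ranked by reward_fn.
    import torch
    import torch.nn.functional as F

    def entropy_bonus(logits):
        # Shannon entropy of the next-token distribution (one scalar per sequence).
        probs = F.softmax(logits, dim=-1)
        return -(probs * torch.log(probs + 1e-12)).sum(dim=-1)

    @torch.no_grad()
    def erp_style_decode(model, tokenizer, reward_fn, beam_width=8, max_len=64, lam=0.1):
        device = next(model.parameters()).device
        # Assumes the tokenizer defines a BOS token; EOS handling omitted for brevity.
        beams = [(torch.tensor([[tokenizer.bos_token_id]], device=device), 0.0)]
        for _ in range(max_len):
            candidates = []
            for seq, score in beams:
                logits = model(seq).logits[:, -1, :]            # next-token logits
                ent = entropy_bonus(logits).item()              # exploration term
                topk = torch.topk(F.log_softmax(logits, dim=-1), beam_width, dim=-1)
                for logp, tok in zip(topk.values[0], topk.indices[0]):
                    new_seq = torch.cat([seq, tok.view(1, 1)], dim=1)
                    candidates.append((new_seq, score + logp.item() + lam * ent))
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        smiles = [tokenizer.decode(seq[0], skip_special_tokens=True) for seq, _ in beams]
        return max(smiles, key=reward_fn)                       # highest-reward molecule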
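
For the pretraining-data row, the sketch below shows one plausible way to train a BPE tokenizer on a file of SMILES strings and instantiate a GPT-2-small model from scratch with Hugging Face transformers/tokenizers. The file name zinc_smiles.txt, the 10,000-token vocabulary, and the byte-level BPE variant are illustrative assumptions; the paper specifies only a BPE tokenizer and a 124M GPT-2 trained on 10.7 million ZINC SMILES.

    # Hypothetical sketch: SMILES BPE tokenizer + from-scratch GPT-2 small.
    # Paths, vocab size, and the byte-level BPE choice are placeholders.
    import os
    from tokenizers import ByteLevelBPETokenizer
    from transformers import GPT2Config, GPT2LMHeadModel, GPT2Tokenizer

    # 1) Train a BPE tokenizer on a file with one SMILES string per line.
    os.makedirs("smiles_tokenizer", exist_ok=True)
    bpe = ByteLevelBPETokenizer()
    bpe.train(files=["zinc_smiles.txt"], vocab_size=10_000, min_frequency=2,
              special_tokens=["<|endoftext|>"])
    bpe.save_model("smiles_tokenizer")                 # writes vocab.json + merges.txt

    tokenizer = GPT2Tokenizer.from_pretrained("smiles_tokenizer")
    tokenizer.pad_token = tokenizer.eos_token

    # 2) GPT-2-small architecture (~124M parameters with the full GPT-2 vocabulary;
    #    somewhat fewer here because of the reduced SMILES vocabulary).
    config = GPT2Config(vocab_size=len(tokenizer))
    model = GPT2LMHeadModel(config)
    print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.0f}M parameters")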
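
The hardware row states that docking surrogate models were trained across four nodes with four A100s each, but no launch scripts are quoted. As a generic illustration only, the sketch below shows a standard PyTorch DistributedDataParallel setup; the surrogate architecture, the fingerprint input size, and the torchrun launch line are assumptions, not the authors' training code.

    # Hypothetical sketch: multi-GPU data-parallel training of a docking-score
    # surrogate. Architecture and input size are placeholders.
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group("nccl")                 # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])      # set by torchrun
        torch.cuda.set_device(local_rank)
        surrogate = torch.nn.Sequential(                # placeholder regressor:
            torch.nn.Linear(2048, 512),                 # fingerprint -> hidden
            torch.nn.ReLU(),
            torch.nn.Linear(512, 1),                    # predicted docking score
        ).cuda(local_rank)
        surrogate = DDP(surrogate, device_ids=[local_rank])
        # ... build a DistributedSampler-backed DataLoader and run the training loop ...
        dist.destroy_process_group()

    if __name__ == "__main__":
        main()  # e.g. torchrun --nnodes=4 --nproc_per_node=4 train_surrogate.py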
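
Finally, for the experiment-setup row, the sketch below illustrates the described procedure of a limited search over candidate hyperparameter values followed by reusing one setting across algorithms. The parameter names and value grids are invented placeholders; the actual ranges are in the paper's Table 6.

    # Hypothetical sketch: limited grid search, then reuse the best setting
    # when comparing algorithms. Names and grids below are placeholders.
    import itertools

    search_space = {
        "entropy_weight": [0.01, 0.05, 0.1],
        "beam_width": [4, 8, 16],
        "temperature": [0.7, 1.0],
    }

    def evaluate(config):
        # Placeholder: run ERP with `config` and return a validation score.
        # Returns 0.0 here only so the loop runs end to end.
        return 0.0

    best_config, best_score = None, float("-inf")
    for values in itertools.product(*search_space.values()):
        config = dict(zip(search_space.keys(), values))
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    # best_config is then held fixed when comparing ERP against the baselines.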