Searching for High-Value Molecules Using Reinforcement Learning and Transformers
Authors: Raj Ghugare, Santiago Miret, Adriana Hugessen, Mariano Phielipp, Glen Berseth
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we explore how different design choices for text grammar and algorithmic choices for training can affect an RL policy's ability to generate molecules with desired properties. We arrive at a new RL-based molecular design algorithm (ChemRLformer) and perform a thorough analysis using 25 molecule design tasks, including computationally complex protein docking simulations. |
| Researcher Affiliation | Collaboration | Raj Ghugare (1,2), Santiago Miret (3), Adriana Hugessen (1,2), Mariano Phielipp (3), Glen Berseth (1,2). Affiliations: (1) Université de Montréal, (2) Mila Quebec AI Institute, (3) Intel Labs |
| Pseudocode | No | The paper describes algorithmic details (e.g., policy gradient algorithm, reward functions) but does not include any explicit pseudocode blocks or algorithms labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | Upon acceptance, we will open-source our code and release the pretrained weights to support reproducible research. |
| Open Datasets | Yes | To advance effectively within this vast search space, we make use of datasets containing a large number of drug-like molecules in text format (Irwin et al., 2012; Sterling and Irwin, 2015b; Mendez et al., 2019). |
| Dataset Splits | No | The paper mentions 'validation loss' ('On the ZINC 250K SMILES dataset, the FC, the RNN and the transformer model achieved a validation loss of 29.417, 22.507, and 22.923 respectively.') but does not specify the explicit proportions or sizes of the training, validation, or test dataset splits. |
| Hardware Specification | Yes | All transformers were trained for 5 epochs, with the largest batch size that we could fit in the memory of a single NVIDIA RTX A6000 GPU, for example, a batch size of 2048 for pretraining the transformer on ZINC 100M dataset. |
| Software Dependencies | Yes | For experiments that apply SELFIES, we convert all datasets to SELFIES using the Python API provided by (Krenn et al., 2020) (Version: 2.1.1). (A hedged conversion sketch appears after the table.) |
| Experiment Setup | Yes | All models used an initial learning rate of 1e-3, with a cosine learning rate schedule (Loshchilov and Hutter, 2017). FC and RNNs used a batch size of 128 and were trained for 10 epochs. All transformers were trained for 5 epochs, with the largest batch size that we could fit in the memory of a single NVIDIA RTX A6000 GPU, for example, a batch size of 2048 for pretraining the transformer on ZINC 100M dataset. (A hedged sketch of this optimization setup appears after the table.) |
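
The Software Dependencies row reports that SMILES datasets were converted to SELFIES with the `selfies` Python package (version 2.1.1) from Krenn et al. (2020). The snippet below is a minimal conversion sketch, not the authors' released code; the example molecules are arbitrary illustrative choices.

```python
# Minimal sketch of SMILES -> SELFIES conversion with the `selfies` package
# (version 2.1.1, as cited in the paper). Illustrative only; the example
# molecules below are arbitrary and not taken from the paper's datasets.
import selfies as sf

smiles_dataset = ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]  # hypothetical sample

# Encode each SMILES string into its SELFIES representation.
selfies_dataset = [sf.encoder(s) for s in smiles_dataset]

# Round-trip back to SMILES to sanity-check the conversion.
recovered = [sf.decoder(s) for s in selfies_dataset]

for original, enc, rec in zip(smiles_dataset, selfies_dataset, recovered):
    print(f"{original} -> {enc} -> {rec}")
```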
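
The Experiment Setup row reports an initial learning rate of 1e-3 with a cosine learning-rate schedule (Loshchilov and Hutter, 2017), a batch size of 128, and 10 epochs for the FC and RNN models. The sketch below shows one way such a schedule could be wired up in PyTorch; the placeholder model, the Adam optimizer, the dummy loss, and the dataset size (roughly ZINC 250K) are assumptions, not details confirmed by the paper.

```python
# Hedged sketch of the reported optimization setup (lr 1e-3, cosine schedule,
# batch size 128, 10 epochs). Model, optimizer (Adam), loss, and dataset size
# are illustrative assumptions; the paper's code was not released at submission.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 128))  # placeholder model

batch_size = 128
epochs = 10
steps_per_epoch = 250_000 // batch_size  # assumption: ~ZINC 250K training examples

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Cosine annealing over the full training run, following Loshchilov & Hutter (2017).
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=epochs * steps_per_epoch
)

for epoch in range(epochs):
    for step in range(steps_per_epoch):
        optimizer.zero_grad()
        x = torch.randn(batch_size, 128)
        loss = model(x).pow(2).mean()  # dummy loss standing in for the real objective
        loss.backward()
        optimizer.step()
        scheduler.step()  # advance the cosine schedule once per optimizer update
```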