Biases in Evaluation of Molecular Optimization Methods and Bias Reduction Strategies

Authors: Hiroshi Kajino, Kohei Miyaguchi, Takayuki Osogami

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Let us empirically quantify the two biases as well as the effectiveness of the bias reduction methods. We employ a reinforcement learning setting as a case study. The code used in our empirical studies will be available in https://github.com/kanojikajino/biases-in-mol-opt.
Researcher Affiliation | Industry | IBM Research Tokyo, Tokyo, Japan. Correspondence to: Hiroshi Kajino <kajino@jp.ibm.com>.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code used in our empirical studies will be available in https://github.com/kanojikajino/biases-in-mol-opt.
Open Datasets | Yes | Specifically, we used the predictor provided by Gottipati et al. (2020), which was trained on the ChEMBL database (Gaulton et al., 2017) to predict the pIC50 value associated with C-C chemokine receptor type 5 (CCR5).
Dataset Splits | No | The paper mentions generating "train and test sets" and using a large sample D_test of size 10^5 to approximate the true property function, but it does not explicitly specify a separate validation split or its size. (See the split sketch after the table.)
Hardware Specification | Yes | We used an IBM Cloud instance with 16 CPU cores at 2.10 GHz, 128 GB memory, and two NVIDIA Tesla P100 GPUs.
Software Dependencies | Yes | We implement the whole simulation in Python 3.9.0. All of the chemistry-related operations, including the template-based chemical reactions, are implemented with RDKit (2021.09.3). (See the RDKit sketch after the table.)
Experiment Setup | Yes | For the first 500 steps, we only update the parameters of the critic, fixing those of the actor; after that, both are updated for another 1,500 steps. They are optimized by AdaGrad (Duchi et al., 2011) with initial learning rate 4 × 10^-4 and batch size 64. (See the training-schedule sketch after the table.)
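
Since the paper reports only train and test sets, the split protocol admits a minimal sketch. The function below is hypothetical (the repository's actual data-loading code may differ) and assumes molecules are stored as SMILES strings; only the large test sample of size 10^5 comes from the paper.

```python
# Minimal sketch of the reported split: train set plus a large test set
# D_test of size 10^5 used to approximate the true property function.
# `split_dataset` and its arguments are hypothetical names.
import random

def split_dataset(smiles_list, test_size=10**5, seed=0):
    """Shuffle a list of SMILES strings and split it into train/test sets."""
    rng = random.Random(seed)
    shuffled = smiles_list[:]
    rng.shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]  # (train, test)
```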
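
For context on the RDKit dependency, here is a minimal sketch of a template-based chemical reaction in RDKit, the kind of operation the paper delegates to the library. The amide-coupling SMARTS template is illustrative only, not one of the paper's actual reaction templates.

```python
# Template-based reaction with RDKit: apply a reaction SMARTS to reactant
# molecules and enumerate the products.
from rdkit import Chem
from rdkit.Chem import AllChem

# Reaction template: carboxylic acid + amine -> amide (illustrative).
rxn = AllChem.ReactionFromSmarts(
    '[C:1](=[O:2])-[OD1].[N!H0:3]>>[C:1](=[O:2])[N:3]'
)

acid = Chem.MolFromSmiles('CC(=O)O')       # acetic acid
amine = Chem.MolFromSmiles('NCc1ccccc1')   # benzylamine

products = rxn.RunReactants((acid, amine))  # tuple of product tuples
print(Chem.MolToSmiles(products[0][0]))     # e.g. 'CC(=O)NCc1ccccc1'
```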
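
The experiment-setup row translates directly into a training schedule. Below is a minimal sketch assuming PyTorch; `actor`, `critic`, the two loss functions, and `sample_batch` are hypothetical placeholders, while the schedule (500 critic-only steps, then 1,500 joint steps), the AdaGrad optimizer, the 4 × 10^-4 learning rate, and the batch size of 64 come from the quote above.

```python
# Sketch of the reported schedule: warm up the critic alone for 500 steps,
# then update actor and critic jointly for another 1,500 steps, both with
# AdaGrad (lr = 4e-4) and batches of 64.
import torch

def train(actor, critic, critic_loss_fn, actor_loss_fn, sample_batch):
    critic_opt = torch.optim.Adagrad(critic.parameters(), lr=4e-4)
    actor_opt = torch.optim.Adagrad(actor.parameters(), lr=4e-4)
    for step in range(2000):
        batch = sample_batch(batch_size=64)
        critic_opt.zero_grad()
        critic_loss_fn(critic, batch).backward()
        critic_opt.step()
        if step >= 500:  # actor parameters are fixed during the warm-up
            actor_opt.zero_grad()
            actor_loss_fn(actor, critic, batch).backward()
            actor_opt.step()
```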