Bidirectional Learning for Offline Model-based Biological Sequence Design

Authors: Can Chen, Yingxue Zhang, Xue Liu, Mark Coates

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive experiments on DNA and protein design tasks, and aim to answer three research questions: (1) How does BIB compare with state-of-the-art algorithms? (2) Is every design component necessary in BIB? (3) Does the Adaptive-η module improve gradient-based methods?"
Researcher Affiliation | Collaboration | "1McGill University, 2Mila - Quebec AI Institute, 3Huawei Noah's Ark Lab. Correspondence to: Can (Sam) Chen <can.chen@mila.quebec>."
Pseudocode | Yes | "Algorithm 1: Bidirectional Learning for Offline Model-based Biological Sequence Design"
Open Source Code | Yes | "Our code is available here." (Introduction); "We provide the code implementation of BIB and Adaptive-η here and we also attach the code in the supplementary material." (Appendix 7.10, Reproducibility Statement)
Open Datasets | Yes | "We conduct experiments on two DNA tasks: TFBind8(r) and TFBind10(r), following (Chen et al., 2022c) and three protein tasks: avGFP, AAV and E4B, in (Ren et al., 2022) which have the most data points. See Appendix 7.3 for more details on task definitions and oracle evaluations."
Dataset Splits | No | "Since the offline nature prohibits standard cross-validation strategies for hyperparameter tuning, all current gradient-based offline model-based algorithms preset the learning rate η."
Hardware Specification | Yes | "All experiments are performed on one NVIDIA 32G V100 in the same cluster."
Software Dependencies | No | "We use PyTorch (Paszke et al., 2019) to run all experiments on one V100 GPU." The paper mentions PyTorch but does not specify its version number or any other software dependencies with version numbers.
Experiment Setup | Yes | "We set the number of iterations T as 25 for all experiments following (Norn et al., 2021) and η0 as 0.1 following (Chen et al., 2022c). We choose OPT as the Adam optimizer (Kingma & Ba, 2015) for all gradient-based methods."
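To make the quoted experiment setup concrete, below is a minimal sketch of a generic gradient-based design loop using the reported settings (T = 25 iterations, Adam with η0 = 0.1). It is not the paper's BIB or Adaptive-η implementation; the surrogate `proxy`, the initialization `x_init`, and the softmax relaxation of discrete sequences are assumptions for illustration only.

```python
# Hypothetical sketch of a gradient-based sequence-design step, not BIB itself.
# Assumes `proxy` is a pretrained surrogate scoring relaxed one-hot sequences
# of shape (batch, length, vocab), and `x_init` holds starting sequence logits.
import torch

def gradient_based_design(proxy, x_init, num_steps=25, lr=0.1):
    """Ascend the surrogate's predicted score for num_steps iterations."""
    x = x_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)          # OPT = Adam, η0 = 0.1 (per the setup)
    for _ in range(num_steps):                  # T = 25 iterations (per the setup)
        opt.zero_grad()
        # Maximize the predicted score by minimizing its negative.
        loss = -proxy(torch.softmax(x, dim=-1)).sum()
        loss.backward()
        opt.step()
    return x.detach()
```

In this sketch the learning rate is preset, which is exactly the limitation the Dataset Splits row quotes: with only offline data, there is no held-out split to tune η, which is the gap the paper's Adaptive-η module is designed to address.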