Population-Based Black-Box Optimization for Biological Sequence Design

Authors: Christof Angermueller, David Belanger, Andreea Gane, Zelda Mariet, David Dohan, Kevin Murphy, Lucy Colwell, D Sculley

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate P3BO and Adaptive-P3BO empirically on over 100 batched black-box optimization problems, and show that P3BO and Adaptive-P3BO are considerably more robust, generate more diverse batches of sequences, and find distinct optima faster than any single method in their population. Adaptive-P3BO improves upon P3BO results, and furthermore is able to recover from a poor initial population of methods.
Researcher Affiliation | Collaboration | (1) Google Research; (2) University of Cambridge. Correspondence to: Christof Angermueller <christofa@google.com>.
Pseudocode | Yes | Algorithm 1: P3BO. Algorithm 2: Adaptation of population members.
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | TfBind8: Barrera et al. (2016) measured the binding activity... TfBind10: (Le et al., 2018) provides neural network predicted estimates... UTR: Sample et al. (2019) introduced a CNN... PfamHMM: Pfam (El-Gebali et al., 2018) is a widely-used database... ProteinDistance: Bileschi et al. (2019) introduced a CNN... PDBIsing: As introduced in Angermueller et al. (2020)...
Dataset Splits | No | The paper mentions evaluating methods on 'in-silico optimization tasks' and providing 'initial dataset' or 'random subset of sequences' to optimizers. However, it does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) as commonly used for supervised learning models.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | We used a decay factor of γ = 0.25 for computing credit scores, and a softmax temperature of τ = 1.0 for computing selection probabilities. For Adaptive-P3BO, we use a quantile cutoff of q = 0.5 for selecting the pool of survivors S, a recombination rate of 0.1, and a mutation rate of 0.5.
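To make the reported hyperparameters concrete, the following is a minimal Python sketch of how a decay factor, a softmax temperature, and a quantile cutoff could interact. It is an illustration under stated assumptions, not the authors' implementation: the summary above does not give the exact formulas, so the credit score is assumed here to be an exponentially decayed sum of per-round rewards (weight γ^age), and the survivor rule is assumed to keep members scoring at or above the q-quantile. The function names, the reward values, and the method names are all hypothetical. Recombination and mutation are left out because their mechanics are not described here.

```python
import math

GAMMA = 0.25  # decay factor reported in the experiment setup
TAU = 1.0     # softmax temperature reported in the experiment setup
Q = 0.5       # quantile cutoff reported for Adaptive-P3BO


def credit_score(rewards, gamma=GAMMA):
    """Assumed form: decayed sum of rewards, where rewards[-1] is the latest round."""
    n = len(rewards)
    return sum(r * gamma ** (n - 1 - t) for t, r in enumerate(rewards))


def selection_probabilities(scores, tau=TAU):
    """Softmax over credit scores with temperature tau (max-shifted for stability)."""
    top = max(scores)
    exps = [math.exp((s - top) / tau) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def survivors(scores, q=Q):
    """Assumed rule: keep indices whose score is at or above the q-quantile."""
    ranked = sorted(scores)
    cutoff = ranked[int(q * (len(ranked) - 1))]
    return [i for i, s in enumerate(scores) if s >= cutoff]


# Hypothetical per-round rewards for three optimizers in the population.
reward_history = {
    "method_a": [0.0, 0.2, 0.8],
    "method_b": [0.9, 0.1, 0.0],
    "method_c": [0.3, 0.3, 0.3],
}
names = list(reward_history)
scores = [credit_score(reward_history[name]) for name in names]
probs = selection_probabilities(scores)
pool = survivors(scores)

for name, score, prob in zip(names, scores, probs):
    print(f"{name}: credit={score:.4f}, selection probability={prob:.4f}")
print("survivors:", [names[i] for i in pool])
```

Under these assumptions, a small γ makes the credit score depend almost entirely on the most recent rounds, so a method that improved lately outranks one that only did well early on. A larger τ would flatten the selection probabilities toward uniform.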