BOSS: Bayesian Optimization over String Spaces
Authors: Henry Moss, David Leslie, Daniel Beck, Javier González, Paul Rayson
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now evaluate our proposed BO framework on tasks from a range of fields and syntactical constraints. Our code is available at github.com/henrymoss/BOSS and is built upon the Emukit Python package [Paleyes et al., 2019]. All results are based on runs across 15 random seeds, showing the mean and a single standard error of the best objective value found as we increase the optimization budget. |
| Researcher Affiliation | Collaboration | Henry B. Moss, STOR-i Centre for Doctoral Training, Lancaster University, UK (h.moss@lancaster.ac.uk); Daniel Beck, Computing and Information Systems, University of Melbourne, Australia (d.beck@unimelb.edu.au); Javier González, Microsoft Research Cambridge, UK; David S. Leslie, Dept. of Mathematics and Statistics, Lancaster University, UK; Paul Rayson, School of Computing and Communications, Lancaster University, UK |
| Pseudocode | No | The paper describes algorithms in text and through figures but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at github.com/henrymoss/BOSS and is built upon the Emukit Python package [Paleyes et al., 2019]. |
| Open Datasets | Yes | We replicate the symbolic regression example of Kusner et al. [2017], using their provided VAEs pre-trained for this exact problem. ...large collection of 250,000 candidate molecules used by Kusner et al. [2017]... |
| Dataset Splits | No | The paper discusses training and testing for different models but does not provide explicit details on train/validation/test dataset splits (percentages or counts) for its own experiments. |
| Hardware Specification | Yes | Although acquisition function calculations could be parallelized across the populations of our GA at each BO step, we use a single-core Intel Xeon 2.30GHz processor to paint a clear picture of computational cost. |
| Software Dependencies | No | The paper mentions building upon the 'Emukit Python package' but does not provide specific version numbers for Emukit or Python, which are necessary for full reproducibility of software dependencies. |
| Experiment Setup | Yes | All results are based on runs across 15 random seeds, showing the mean and a single standard error of the best objective value found as we increase the optimization budget. ... After a random initialization of min(5, \|Σ\|) evaluations, kernel parameters are re-estimated to maximize model likelihood before each BO step. ... Our genetic algorithms (ga) [are] limited to 100 evolutions of a population of size 100. |
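
The reporting protocol quoted in the Experiment Setup row (mean and one standard error of the best objective value found so far, across 15 seeds) can be sketched as follows. This is a minimal illustration and is not taken from the BOSS repository: the function name `best_so_far_summary`, the `(n_seeds, budget)` array layout, and the synthetic data in the demo are all assumptions made for the example.

```python
import numpy as np


def best_so_far_summary(objective_values, maximize=True):
    """Summarize optimization runs across seeds (hypothetical helper).

    objective_values: array-like of shape (n_seeds, budget), where entry
    [s, t] is the objective value observed at evaluation t of seed s.
    Returns the mean and standard error, across seeds, of the best value
    found up to each evaluation.
    """
    values = np.asarray(objective_values, dtype=float)
    if values.ndim != 2:
        raise ValueError("expected an array of shape (n_seeds, budget)")

    # Running best per seed: cumulative max (or min when minimizing).
    accumulate = np.maximum.accumulate if maximize else np.minimum.accumulate
    running_best = accumulate(values, axis=1)

    n_seeds = running_best.shape[0]
    mean = running_best.mean(axis=0)
    if n_seeds > 1:
        # Standard error = sample standard deviation / sqrt(number of seeds).
        stderr = running_best.std(axis=0, ddof=1) / np.sqrt(n_seeds)
    else:
        stderr = np.zeros_like(mean)
    return mean, stderr


if __name__ == "__main__":
    # Synthetic stand-in for real results: 15 seeds, budget of 20 evaluations.
    rng = np.random.default_rng(0)
    fake_runs = rng.normal(size=(15, 20))
    mean, stderr = best_so_far_summary(fake_runs)
    for step, (m, se) in enumerate(zip(mean, stderr), start=1):
        print(f"evaluation {step:2d}: {m:.3f} +/- {se:.3f}")
```

Plotting `mean` with a band of `mean ± stderr` against the evaluation index gives curves of the kind the quoted passage describes. Whether the paper's objectives are maximized or minimized differs by task, hence the `maximize` flag.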