Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Practical Two-Step Lookahead Bayesian Optimization

Authors: Jian Wu, Peter Frazier

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the value of our algorithm with extensive experiments on synthetic test functions and real-world problems.
Researcher Affiliation | Collaboration | Jian Wu EMAIL Peter I. Frazier School of Operations Research and Information Engineering Cornell University Ithaca, NY 14850 EMAIL Peter Frazier is also a Staff Data Scientist at Uber
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | An implementation is available in the Cornell-MOE codebase, https://github.com/wujian16/Cornell-MOE, and the code to replicate our experiments is available at https://github.com/wujian16/TwoStep-BayesOpt.
Open Datasets | Yes | First, we test 2-OPT and benchmark methods on 6 well-known synthetic test functions chosen from Bingham [2015] ranging from 2d to 10d: 2d Branin, 2d Camel, 5d Ackley5, 6d Hartmann6, 8d Cosine and 10d Levy. ... The HPOlib library was developed in Eggensperger et al. [2013] based on hyperparameter tuning benchmarks from Snoek et al. [2012]. ... The assemble-to-order (ATO) benchmark [Hong and Nelson, 2006, Poloczek et al., 2017] ... The robot pushing problem is a 14-dimensional reinforcement learning problem considered in Wang and Jegelka [2017].
Dataset Splits | No | The paper describes initializing the algorithms by randomly sampling 3 points from a Latin hypercube design and then starting an iterative Bayesian optimization process. It does not provide train/validation/test splits of the kind common in supervised learning experiments, because Bayesian optimization evaluates points sequentially rather than partitioning a static dataset.
Hardware Specification | Yes | In the supplement, Figure 4 shows the time required for acquisition function optimization on 1 core from an AWS t2.2xlarge instance for 2-OPT, EI, KG, and GLASSES.
Software Dependencies | No | We integrate over GP hyperparameters by sampling 16 sets of values using the emcee package [Foreman-Mackey et al., 2013]. An implementation is available in the Cornell-MOE codebase, https://github.com/wujian16/Cornell-MOE, and the code to replicate our experiments is available at https://github.com/wujian16/TwoStep-BayesOpt. (Specific packages and codebases are mentioned, but their version numbers are not provided.)
Experiment Setup | Yes | Following Snoek et al. [2012], we use a constant mean prior and the ARD Matérn 5/2 kernel. We integrate over GP hyperparameters by sampling 16 sets of values using the emcee package [Foreman-Mackey et al., 2013]. We initiate our algorithms by randomly sampling 3 points from a Latin hypercube design and then start the Bayesian optimization iterative process. We use 100 random initializations in the synthetic and real functions experiments, 40 in the comparisons to multi-step lookahead methods (replicating the experiment setup of Lam et al. [2016]), and 10 for comparisons of computation time.
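The Experiment Setup entry names two standard building blocks: a 3-point Latin hypercube initial design and an ARD Matérn 5/2 kernel. The following is a minimal NumPy sketch of both, not the authors' Cornell-MOE implementation. The function names `latin_hypercube` and `ard_matern52`, the lengthscale values, and the use of the usual 2d Branin domain as the search box are illustrative assumptions.

```python
import numpy as np


def latin_hypercube(n_points, bounds, rng):
    """Draw n_points from a Latin hypercube over a box.

    bounds has shape (d, 2) with rows [lower, upper]. Hypothetical helper,
    not taken from the paper's code.
    """
    bounds = np.asarray(bounds, dtype=float)
    d = bounds.shape[0]
    # One stratum per point in each dimension, jittered uniformly inside it.
    u = (np.arange(n_points)[:, None] + rng.random((n_points, d))) / n_points
    # Shuffle the strata independently in every dimension.
    for j in range(d):
        u[:, j] = u[rng.permutation(n_points), j]
    # Rescale from the unit cube to the requested box.
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])


def ard_matern52(x1, x2, lengthscales, signal_variance=1.0):
    """ARD Matern 5/2 kernel matrix between the rows of x1 and x2."""
    lengthscales = np.asarray(lengthscales, dtype=float)
    a = np.atleast_2d(x1) / lengthscales  # one lengthscale per input dimension
    b = np.atleast_2d(x2) / lengthscales
    r = np.sqrt(np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1))
    s = np.sqrt(5.0) * r
    # k(r) = sigma^2 * (1 + sqrt(5) r + 5 r^2 / 3) * exp(-sqrt(5) r)
    return signal_variance * (1.0 + s + s ** 2 / 3.0) * np.exp(-s)


rng = np.random.default_rng(0)
unit_design = latin_hypercube(3, [[0.0, 1.0], [0.0, 1.0]], rng)
# Usual 2d Branin domain, used here only as an example search box.
branin_design = latin_hypercube(3, [[-5.0, 10.0], [0.0, 15.0]], rng)
# Assumed lengthscales; the paper instead samples GP hyperparameters with emcee.
K = ard_matern52(branin_design, branin_design, lengthscales=[3.0, 3.0])
```

Each column of `unit_design` contains exactly one point per third of the unit interval, which is the defining property of a Latin hypercube design. In the paper's setup the kernel hyperparameters are integrated out by sampling 16 sets of values rather than fixed as they are in this sketch.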