Multi-Fidelity Bayesian Optimization via Deep Neural Networks

Authors: Shibo Li, Wei Xing, Robert Kirby, Shandian Zhe

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show the advantages of our method in both synthetic benchmark datasets and real-world applications in engineering design.
Researcher Affiliation | Academia | Shibo Li, School of Computing, University of Utah, Salt Lake City, UT 84112, shibo@cs.utah.edu; Wei Xing, Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT 84112, wxing@sci.utah.edu; Robert M. Kirby, School of Computing, University of Utah, Salt Lake City, UT 84112, kirby@cs.utah.edu; Shandian Zhe, School of Computing, University of Utah, Salt Lake City, UT 84112, zhe@cs.utah.edu
Pseudocode | Yes | Algorithm 1 DNN-MFBO(D, M, T, {λ_m}_{m=1}^{M}) (a minimal sketch of this loop structure appears after the table)
Open Source Code | No | The paper provides links to the implementations of competing methods (e.g., 'https://github.com/kirthevasank/mf-gp-ucb') but does not state that the code for DNN-MFBO is open-source or provide a link to its repository.
Open Datasets | Yes | We first evaluated DNN-MFBO in three popular synthetic benchmark tasks. (1) Branin function (Forrester et al., 2008; Perdikaris et al., 2017)... (2) Park1 function (Park, 1991)... (3) Levy function (Laguna and Martí, 2005)... (the standard Branin form is sketched after the table)
Dataset Splits | Yes | To identify the architecture of the neural network in each fidelity and the learning rate, we first ran the AutoML tool SMAC3 (https://github.com/automl/SMAC3) on the initial training dataset (we randomly split the data into half for training and the other half for test, and repeated multiple times to obtain a cross-validation accuracy to guide the search) and then manually tuned these hyper-parameters. (the repeated half-split evaluation is sketched after the table)
Hardware Specification | Yes | For a fair comparison, we ran all the methods on a Linux workstation with a 16-core Intel(R) Xeon(R) CPU E5-2670 and 16GB RAM.
Software Dependencies | No | The paper mentions software such as TensorFlow, Matlab, Python, and NumPy, but does not provide specific version numbers for these or other libraries.
Experiment Setup | Yes | The depth and width of each network were chosen from [2, 12] and [32, 512], and the learning rate from [10^-5, 10^-1]. We used ADAM (Kingma and Ba, 2014) for stochastic training. The number of epochs was set to 5,000, which is enough for convergence. (an illustrative training configuration is sketched after the table)
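To make the Pseudocode row concrete, the sketch below illustrates the generic control flow implied by the signature DNN-MFBO(D, M, T, {λ_m}): fit a surrogate on all fidelities, pick the input and fidelity with the best benefit per cost, query it, and repeat. The surrogate and acquisition are deliberately left as hypothetical `fit_surrogate` and `acquisition` callables; this is a structural sketch only, not the authors' deep-network model or information-based acquisition.

```python
# Minimal sketch of a multi-fidelity Bayesian optimization loop with the
# signature DNN-MFBO(D, M, T, {lambda_m}).  `fit_surrogate`, `acquisition`,
# and `query` are hypothetical placeholders, not the paper's implementation.
import numpy as np

def dnn_mfbo_sketch(D, M, T, costs, fit_surrogate, acquisition, query):
    """D: list of (x, m, y) observations; M: number of fidelities;
    T: optimization iterations; costs: per-fidelity query costs lambda_m."""
    for _ in range(T):
        model = fit_surrogate(D)                      # train surrogate on all fidelities
        best_gain, best_x, best_m = -np.inf, None, None
        for m in range(M):                            # pick fidelity by benefit / cost
            x, gain = acquisition(model, m)
            if gain / costs[m] > best_gain:
                best_gain, best_x, best_m = gain / costs[m], x, m
        y = query(best_x, best_m)                     # evaluate the chosen fidelity
        D.append((best_x, best_m, y))
    # Report the best highest-fidelity observation (assumes maximization;
    # negate y for minimization problems).
    top = [(x, y) for (x, m, y) in D if m == M - 1]
    return max(top, key=lambda p: p[1])
```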
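The Branin function named in the Open Datasets row has a standard closed form; the snippet below implements that common single-fidelity version for reference. The paper's multi-fidelity variants (e.g., following Perdikaris et al., 2017) modify this base function and are not reproduced here.

```python
import numpy as np

def branin(x1, x2):
    """Standard Branin benchmark on [-5, 10] x [0, 15] (single-fidelity form)."""
    a, b, c = 1.0, 5.1 / (4 * np.pi ** 2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
    return a * (x2 - b * x1 ** 2 + c * x1 - r) ** 2 + s * (1 - t) * np.cos(x1) + s

# The global minimum value is about 0.397887, attained e.g. at (pi, 2.275).
print(branin(np.pi, 2.275))
```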
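The Dataset Splits row describes repeated random half/half splits whose averaged accuracy guides the SMAC3 search. The snippet below sketches only that evaluation loop with scikit-learn utilities; the SMAC3 search itself is omitted, and `train_and_score` is a hypothetical stand-in for training and scoring one candidate configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def repeated_half_split_score(X, y, train_and_score, repeats=5, seed=0):
    """Average score over repeated random 50/50 train/test splits,
    used here as the cross-validation signal for hyper-parameter search."""
    scores = []
    for r in range(repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.5, random_state=seed + r)
        scores.append(train_and_score(X_tr, y_tr, X_te, y_te))
    return float(np.mean(scores))
```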
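For the Experiment Setup row, the sketch below shows how one candidate network within the reported search ranges (depth in [2, 12], width in [32, 512], learning rate in [10^-5, 10^-1]) might be built and trained with the Adam optimizer for 5,000 epochs. The Keras API, the tanh activation, and the mean-squared-error loss are illustrative assumptions; the paper mentions TensorFlow but does not publish its per-fidelity architectures or training code.

```python
import tensorflow as tf

def build_and_train(X, y, depth=4, width=64, lr=1e-3, epochs=5000):
    """One candidate network drawn from the reported search ranges:
    depth in [2, 12], width in [32, 512], learning rate in [1e-5, 1e-1].
    Activation and loss are assumptions made for this sketch."""
    hidden = [tf.keras.layers.Dense(width, activation="tanh") for _ in range(depth)]
    model = tf.keras.Sequential(hidden + [tf.keras.layers.Dense(1)])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr), loss="mse")
    # 5,000 epochs is the count the paper reports as enough for convergence.
    model.fit(X, y, epochs=epochs, verbose=0)
    return model
```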