Multi-fidelity Bayesian Optimisation with Continuous Approximations

Authors: Kirthevasan Kandasamy, Gautam Dasarathy, Jeff Schneider, Barnabás Póczos

ICML 2017

Reproducibility Assessment. Each variable below lists the result and the supporting LLM response.

Research Type: Experimental
LLM Response: "An empirical study which demonstrates that BOCA outperforms alternatives, both multi-fidelity and otherwise, on a series of synthetic problems and real examples in hyper-parameter tuning and astrophysics." From Section 4 (Experiments): "We compare BOCA to the following four baselines: (i) GP-UCB, (ii) the GP-EI criterion in BO (Jones et al., 1998), (iii) MF-GP-UCB (Kandasamy et al., 2016a), and (iv) MF-SKO, the multi-fidelity sequential kriging optimisation method from Huang et al. (2006)."

Researcher Affiliation: Academia
LLM Response: "1 Carnegie Mellon University, Pittsburgh, PA, USA; 2 Rice University, Houston, TX, USA."

Pseudocode: Yes
LLM Response: "Algorithm 1 BOCA. Input: kernel κ. Set ν0(·) ← 0, τ0(·) ← κ(·,·)^{1/2}, D0 ← ∅. For t = 1, 2, …"

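Only the initialisation of the pseudocode survives the extraction. As a rough illustration of the loop it opens, here is a minimal single-fidelity GP-UCB sketch in numpy: maintain a GP posterior mean ν and standard deviation τ under an SE kernel, and query the maximiser of ν + β^{1/2}τ. This is not BOCA itself, which additionally chooses an evaluation fidelity z at every step; the grid domain, fixed bandwidth, and constant β are illustrative assumptions.

```python
import numpy as np

def se_kernel(A, B, bandwidth=0.2, scale=1.0):
    """Squared-exponential (SE) kernel, the kernel used in the paper's experiments."""
    d = A[:, None, :] - B[None, :, :]
    return scale * np.exp(-0.5 * np.sum(d**2, axis=-1) / bandwidth**2)

def gp_posterior(X_obs, y_obs, X_query, noise=1e-3):
    """GP posterior mean (nu) and standard deviation (tau) at the query points."""
    K = se_kernel(X_obs, X_obs) + noise * np.eye(len(X_obs))
    k = se_kernel(X_obs, X_query)
    K_inv = np.linalg.inv(K)
    mean = k.T @ K_inv @ y_obs
    var = se_kernel(X_query, X_query).diagonal() - np.sum(k * (K_inv @ k), axis=0)
    return mean, np.sqrt(np.maximum(var, 1e-12))

def ucb_loop(f, X_grid, n_iters=30, beta=2.0, noise_sd=0.05, rng=None):
    """Single-fidelity GP-UCB: repeatedly query the argmax of nu + sqrt(beta)*tau."""
    if rng is None:
        rng = np.random.default_rng(0)
    X_obs = X_grid[rng.integers(len(X_grid))][None, :]
    y_obs = np.array([f(X_obs[0]) + noise_sd * rng.standard_normal()])
    for _ in range(n_iters):
        mean, std = gp_posterior(X_obs, y_obs, X_grid, noise=noise_sd**2)
        x_next = X_grid[np.argmax(mean + np.sqrt(beta) * std)]
        y_next = f(x_next) + noise_sd * rng.standard_normal()
        X_obs = np.vstack([X_obs, x_next])
        y_obs = np.append(y_obs, y_next)
    return X_obs[np.argmax(y_obs)]

# Example: maximise a noisy 1-D test function on a grid.
grid = np.linspace(0, 1, 200)[:, None]
best = ucb_loop(lambda x: np.sin(6 * x[0]) * x[0], grid)
print("best x found:", best)
```

In the paper's analysis β_t grows with t; the fixed β here only keeps the sketch short.
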
Open Source Code: No
LLM Response: The paper does not provide any statement or link indicating the availability of open-source code for the described methodology.

Open Datasets: Yes
LLM Response: "We use data on Type Ia supernova for maximum likelihood inference on 3 cosmological parameters... We use the dataset from Davis et al. (2007) which has data on 192 supernovae." Also: "We use the 20 news groups dataset (Joachims, 1996) in a text classification task."

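The 20 newsgroups dataset ships with scikit-learn, which the paper mentions elsewhere; a minimal way to load it for a text classification task is sketched below. The TF-IDF featurisation is an assumption, since the paper does not state its preprocessing.

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer

# Downloads the 20 newsgroups corpus on first use (11,314 training documents).
train = fetch_20newsgroups(subset="train")
X = TfidfVectorizer(max_features=5000).fit_transform(train.data)
print(X.shape, len(train.target_names))  # (11314, 5000) and 20 classes
```
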
Dataset Splits: Yes
LLM Response: "Each evaluation takes the given dataset of size N and splits it up into 5 to perform 5-fold cross validation."

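A sketch of the quoted protocol using scikit-learn's KFold. In the paper's multi-fidelity setting the training-set size N itself plays the role of a fidelity, and the exact splitting code is not given, so this is illustrative only.

```python
import numpy as np
from sklearn.model_selection import KFold

def cv_score(model, X, y):
    """Average validation accuracy over a 5-fold split, as described in the quote."""
    scores = []
    for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
        model.fit(X[train_idx], y[train_idx])
        scores.append(model.score(X[val_idx], y[val_idx]))
    return float(np.mean(scores))
```
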
Hardware Specification: No
LLM Response: The paper does not provide specific details regarding the hardware used for running the experiments, such as CPU/GPU models or memory specifications.

Software Dependencies: No
LLM Response: The paper mentions scikit-learn but does not provide a version number or list other software dependencies with specific versions.

Experiment Setup: Yes
LLM Response: "As the kernel and other GP hyper-parameters are unknown, we learn them by maximising the marginal likelihood every 25 iterations." "All methods are based on GPs and we use the SE kernel for both the fidelity space and domain." "To reflect the setting in our theory, we add Gaussian noise to the function value when observing g at any (z, x)." "For the first two figures we used a capital of 30λ(z•); therefore, a method which only queries at g(z•, ·) can make at most 30 evaluations."
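
The quoted setup (SE kernel, Gaussian observation noise, hyper-parameters refit by maximising the marginal likelihood every 25 iterations) can be approximated with scikit-learn's GaussianProcessRegressor, whose fit() maximises the log marginal likelihood. The refit-every-25 loop and the toy objective below are assumptions about how one might mirror the schedule, not the authors' code.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# SE kernel (called RBF in scikit-learn) plus a learned Gaussian noise term.
kernel = RBF(length_scale=0.5) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)

X_obs, y_obs = [], []
for t in range(1, 101):
    x = np.random.rand(2)                    # stand-in for the BO query point
    X_obs.append(x)
    y_obs.append(np.sin(4 * x[0]) + 0.05 * np.random.randn())  # noisy observation
    if t % 25 == 0:                          # refit hyper-parameters every 25 iterations
        gp.fit(np.array(X_obs), np.array(y_obs))
        print(f"t={t}: log marginal likelihood = "
              f"{gp.log_marginal_likelihood(gp.kernel_.theta):.2f}")
```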