MALIBO: Meta-learning for Likelihood-free Bayesian Optimization

Authors: Jiarong Pan, Stefan Falkner, Felix Berkenkamp, Joaquin Vanschoren

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments show that our method achieves strong performance and outperforms multiple meta-learning BO methods across various benchmarks." From Section 5 (Experiments): "In this section, we first show the effects of using Thompson sampling and gradient boosting through a preliminary ablation study. Subsequently, we describe the experiments conducted to empirically evaluate our method."
Researcher Affiliation | Collaboration | "1 Bosch Center for Artificial Intelligence, Germany; 2 Eindhoven University of Technology, Netherlands."
Pseudocode | Yes | "Algorithm 1 MALIBO: Meta-learning for likelihood-free Bayesian optimization"
Open Source Code | Yes | "Our code is available in the following repository: https://github.com/boschresearch/meta-learning-likelihood-free-bayesian-optimization"
Open Datasets | Yes | "We empirically evaluate our method on various real-world optimization tasks, focusing on AutoML problems, including neural architecture search (NASBench201) (Dong & Yang, 2020), hyperparameter optimization for neural networks (HPOBench) (Klein & Hutter, 2019) and machine learning algorithms (HPO-B) (Pineda-Arango et al., 2021)."
Dataset Splits | Yes | "To train and evaluate the meta-learning BO methods in HPOBench and NASBench201, we conduct our experiments in a leave-one-task-out way: all meta-learning methods use one task as the target task and all others as related tasks. ... For HPO-B, we utilize the provided meta-train and meta-validation dataset to train the meta-learning methods and evaluate all methods on the meta-test data." (A split-construction sketch follows the table.)
Hardware Specification | Yes | "We ran all baselines on 4 CPUs (Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz) except for MetaBO, which requires more computation and we explain the details down below. ... MetaBO required almost 2 hours, using one NVIDIA Titan X GPU and 10 Intel(R) Xeon(R) CPU E5-2697 v3 CPUs."
Software Dependencies | No | The paper mentions various software components and libraries used (e.g., BoTorch, scikit-learn, the ADAM optimizer), but does not provide specific version numbers for these dependencies, which are required for reproducibility.
Experiment Setup | Yes | "During meta-training, we optimize the parameters in the network with the ADAM optimizer (Kingma & Ba, 2015), with learning rate lr = 10^-3 and batch size of B = 256. In addition, we apply exponential decay to the learning rate in each epoch with factor of 0.999. The model is trained for 2,048 epochs with early stopping. For the regularization loss, we set the regularization factor λ = 0.1 in Equation (3)..." (A training-configuration sketch follows the table.)
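
The leave-one-task-out protocol quoted under Dataset Splits can be made concrete with a short sketch. This is a minimal illustration under assumptions, not the authors' evaluation harness; the task identifiers and the commented-out meta_train / run_bo_on_target helpers are hypothetical placeholders.

```python
# Minimal sketch of a leave-one-task-out evaluation loop, as described for
# HPOBench and NASBench201. Task names and the helper functions are hypothetical.

def leave_one_task_out(tasks):
    """Yield (target_task, related_tasks) pairs so each task is held out once."""
    for i, target in enumerate(tasks):
        related = tasks[:i] + tasks[i + 1:]
        yield target, related

tasks = ["task_a", "task_b", "task_c", "task_d"]  # placeholder task IDs

for target_task, related_tasks in leave_one_task_out(tasks):
    # Meta-train on all related tasks, then run BO on the held-out target task.
    # model = meta_train(related_tasks)          # hypothetical
    # result = run_bo_on_target(model, target_task)  # hypothetical
    print(f"target={target_task}, related={related_tasks}")
```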
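
The Experiment Setup quote pins down the meta-training hyperparameters: Adam with lr = 10^-3, batch size 256, per-epoch exponential learning-rate decay of 0.999, up to 2,048 epochs with early stopping, and regularization factor λ = 0.1. Below is a hedged PyTorch sketch of such a loop; the model, data, task loss, regularization term, and early-stopping patience are assumptions, since the paper's network and Equation (3) are not reproduced here.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder model and data; the actual MALIBO architecture and meta-dataset
# are not reproduced here.
model = torch.nn.Linear(16, 1)
dataset = TensorDataset(torch.randn(1024, 16), torch.randn(1024, 1))
loader = DataLoader(dataset, batch_size=256, shuffle=True)        # B = 256

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)          # lr = 10^-3
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.999)

lam = 0.1                              # regularization factor lambda (Equation (3))
best_loss, patience, bad_epochs = float("inf"), 50, 0              # patience is an assumption

for epoch in range(2048):                                          # at most 2,048 epochs
    epoch_loss = 0.0
    for x, y in loader:
        optimizer.zero_grad()
        task_loss = torch.nn.functional.mse_loss(model(x), y)      # placeholder objective
        reg_loss = sum(p.pow(2).sum() for p in model.parameters()) # stand-in regularizer
        loss = task_loss + lam * reg_loss
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    scheduler.step()                                               # decay lr by 0.999 each epoch

    # Simple early stopping on the training loss; the paper's criterion is not specified.
    if epoch_loss < best_loss - 1e-6:
        best_loss, bad_epochs = epoch_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break
```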