Bayesian Optimization with Unknown Search Space

Authors: Huong Ha, Santu Rana, Sunil Gupta, Thanh Nguyen, Hung Tran-The, Svetha Venkatesh

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate our method on both benchmark test functions and machine learning hyper-parameter tuning tasks and demonstrate that our method outperforms baselines. Our experimental results show that our method achieves better function values with fewer samples compared to state-of-the-art approaches.
Researcher Affiliation | Academia | Huong Ha, Santu Rana, Sunil Gupta, Thanh Nguyen, Hung Tran-The, Svetha Venkatesh, Applied Artificial Intelligence Institute (A2I2), Deakin University, Geelong, Australia. {huong.ha, santu.rana, sunil.gupta, thanhnt, hung.tranthe, svetha.venkatesh}@deakin.edu.au
Pseudocode | Yes | Algorithm 1 describes the proposed Bayesian optimization with unknown search space algorithm.
Open Source Code | Yes | Our source code is publicly available at https://github.com/HuongHa12/BO_unknown_searchspace.
Open Datasets | Yes | Next we apply our method on hyperparameter tuning of three machine learning models on the MNIST dataset... We train the models with this hyperparameter setting using the MNIST train dataset (55000 patterns) and then test the model on the MNIST test dataset (10000 patterns).
Dataset Splits | Yes | We train the models with this hyperparameter setting using the MNIST train dataset (55000 patterns) and then test the model on the MNIST test dataset (10000 patterns). Bayesian optimization method then suggests a new hyperparameter setting based on the prediction accuracy on the test dataset.
Hardware Specification | Yes | All the time measurements were taken when evaluating the methods on a Ubuntu 18.04.2 server with Intel Xeon CPU E5-2670 2.60GHz 128GB RAM.
Software Dependencies | No | The paper states 'All the source codes are written in Python 3.6.' and mentions the 'scikit-learn package' and 'tensorflow', but only Python is given with a version number. Key libraries such as scikit-learn and tensorflow are mentioned without specific versions, which is insufficient for full reproducibility according to the criteria.
Experiment Setup | Yes | For all algorithms, the Squared Exponential kernel is used, the GP models are fitted using the Maximum Likelihood method, and the output observations {y_i} are normalized as y_i ~ N(0, 1). As with previous GP-based algorithms that use confidence bounds [3, 19], our theoretical choice of {β_t} in Theorem 5.1 is typically overly conservative. Hence, following the suggestion in [19], for any algorithms that use the GP-UCB acquisition, we scale β_t down by a factor of 5. Finally, for the synthetic functions, ϵ is set at 0.05 whilst for the machine learning models, ϵ is set at 0.02 as we require higher accuracy in these cases. The model is trained with the Adam optimizer in 20 epochs and the batch size is 128.
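The GP-UCB setup quoted in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' implementation: the kernel length scale, noise level, and β_t schedule below are assumptions chosen for the sketch, while the Squared Exponential kernel, the N(0, 1) normalization of observations, and scaling β_t down by a factor of 5 come from the quoted text.

```python
# Minimal GP-UCB sketch: Squared Exponential kernel, normalized
# observations, and beta_t scaled down by a factor of 5.
# Length scale, noise, and the beta_t formula are illustrative assumptions.
import numpy as np

def sq_exp_kernel(A, B, length_scale=0.3):
    """Squared Exponential (RBF) kernel matrix between row vectors."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X, y, Xs, noise=1e-4):
    """GP posterior mean and standard deviation at query points Xs."""
    K = sq_exp_kernel(X, X) + noise * np.eye(len(X))
    Ks = sq_exp_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(np.diag(sq_exp_kernel(Xs, Xs)) - (v * v).sum(0), 1e-12, None)
    return mu, np.sqrt(var)

def gp_ucb(X, y, Xs, t, scale=5.0):
    """GP-UCB acquisition on N(0,1)-normalized observations, with the
    confidence multiplier beta_t divided by `scale` (5, per the setup)."""
    y_norm = (y - y.mean()) / (y.std() + 1e-12)
    beta_t = 2.0 * np.log(t**2 * np.pi**2 / 0.6)  # assumed beta_t schedule
    mu, sigma = gp_posterior(X, y_norm, Xs)
    return mu + np.sqrt(beta_t / scale) * sigma

# Usage: pick the next evaluation point on a candidate grid.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))     # observed inputs
y = np.sin(3 * X[:, 0])                # observed function values
Xs = np.linspace(0, 1, 101)[:, None]   # candidate grid
next_x = Xs[np.argmax(gp_ucb(X, y, Xs, t=6))]
```

In practice the kernel hyperparameters would be refit by maximum likelihood at each iteration, as the quoted setup describes; the sketch keeps them fixed for brevity.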