Bayesian Optimization Meets Bayesian Optimal Stopping

Authors: Zhongxiang Dai, Haibin Yu, Bryan Kian Hsiang Low, Patrick Jaillet

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate the performance of BO-BOS and demonstrate its generality in hyperparameter optimization of ML models and two other interesting applications.
Researcher Affiliation | Academia | (1) Department of Computer Science, National University of Singapore, Republic of Singapore. (2) Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, USA.
Pseudocode | Yes | Algorithm 1: BO-BOS (a minimal sketch of the loop appears after the table).
Open Source Code | No | The paper does not include any statement about releasing source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets | Yes | We first tune three hyperparameters of LR trained on the MNIST image dataset. We tune six hyperparameters of CNN using two image datasets: CIFAR-10 (Krizhevsky, 2009) and Street View House Numbers (SVHN) (Netzer et al., 2011). We apply our algorithm to the Swimmer-v2 task from OpenAI Gym, MuJoCo (Brockman et al., 2016; Todorov et al., 2012). We additionally compare with Hyperband since it was previously applied to random feature approximation in kernel methods (Li et al., 2017). In this task, we tune four hyperparameters of the gradient boosting model (XGBoost (Chen & Guestrin, 2016)) trained on an email spam dataset.
Dataset Splits | No | The paper mentions "validation accuracy" and "validation error" frequently, but it does not specify the exact percentages or counts for training, validation, or test splits for any of the datasets used in its experiments.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using tools like OpenAI Gym, MuJoCo, and XGBoost, and refers to implementations in previous works, but it does not specify version numbers for any of these software dependencies.
Experiment Setup | Yes | For simplicity, we fix = 2 and investigate the impact of different sequences of K1,t values. The sequence (K1,t)b, as well as the values of K2 = 99 and cd0 = 1, will be used in the following experiments if not further specified. We apply our algorithm to the Swimmer-v2 task from OpenAI Gym, MuJoCo (Brockman et al., 2016; Todorov et al., 2012), and use a linear policy consisting of 16 parameters. Each episode consists of 1000 steps, and we treat every m consecutive steps as one single epoch such that N = 1000/m. Direct application of BO-BOS in this task is inappropriate since the growth pattern of cumulative rewards differs significantly from the evolution of the learning curves of ML models (Appendix D.3). Therefore, the rewards are discounted (by γ) when calculating the objective function, because the pattern of the discounted return (cumulative rewards) bears close resemblance to that of learning curves. Note that although the value of the objective function is the discounted return, we also record and report the corresponding un-discounted return, which is the ultimate objective to be maximized. As a result, N and γ should be chosen such that the value of the discounted return faithfully aligns with its un-discounted counterpart. Fig. 3 plots the best (un-discounted) return in an episode against the total number of steps, in which BO-BOS (with N = 50 and γ = 0.9) outperforms GP-UCB (for both γ = 0.9 and γ = 1.0). The K1,t sequence is made smaller than before, K1,t = K1,t-1/0.99, because in this setting more aggressive early stopping is needed for BO-BOS to show its advantage.
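For readers who want a concrete picture of how the pieces referenced above (Algorithm 1, the K1,t sequence, epoch-level early stopping) fit together, below is a minimal Python sketch of a BO-BOS-style loop: a GP-UCB outer loop that trains each candidate epoch by epoch and may stop a run early. This is a sketch under stated assumptions, not the paper's implementation: `train_one_epoch`, the candidate grid, and `should_stop_early` are hypothetical stand-ins, and the placeholder stopping heuristic only gestures at the Bayesian optimal stopping rule (backward induction with the K1,t/K2 cost parameters) that the paper actually uses.

```python
# Hedged sketch of a BO-BOS-style loop. NOT the paper's Algorithm 1:
# the training function, candidate grid, and stopping heuristic below
# are hypothetical stand-ins for illustration only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern


def train_one_epoch(hparams, epoch):
    # Hypothetical stand-in for one epoch of ML training: returns a
    # noisy validation accuracy that improves with the epoch count.
    lr = hparams[0]
    return (1.0 - np.exp(-lr * (epoch + 1))) + 0.01 * np.random.randn()


def should_stop_early(curve, incumbent, k1_t):
    # Placeholder heuristic, NOT the paper's BOS backward-induction rule:
    # stop once the partial learning curve looks unlikely to beat the
    # incumbent. A larger k1_t (cost of a wrong early stop) makes
    # stopping rarer, mirroring the role of K1,t in the paper.
    if len(curve) < 3:
        return False
    return max(curve) + k1_t * np.std(curve[-3:]) < incumbent


def bo_bos(n_iters=20, n_epochs=50, beta=2.0):
    rng = np.random.default_rng(0)
    candidates = rng.uniform(0.01, 1.0, size=(200, 1))  # toy 1-D hyperparameter space
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    X, y = [], []                 # observed (hyperparameter, score) pairs
    incumbent, k1_t = -np.inf, 1.0
    for _ in range(n_iters):
        if X:
            # GP-UCB acquisition: maximize posterior mean + sqrt(beta) * std.
            gp.fit(np.array(X), np.array(y))
            mu, sigma = gp.predict(candidates, return_std=True)
            x_next = candidates[int(np.argmax(mu + np.sqrt(beta) * sigma))]
        else:
            x_next = candidates[0]
        curve = []
        for epoch in range(n_epochs):  # epoch-level training with early stopping
            curve.append(train_one_epoch(x_next, epoch))
            if should_stop_early(curve, incumbent, k1_t):
                break  # run early-stopped before reaching n_epochs
        score = max(curve)
        X.append(x_next)
        y.append(score)
        incumbent = max(incumbent, score)
        k1_t /= 0.99  # grow K1,t each BO iteration, as in the report
    return incumbent


if __name__ == "__main__":
    print("best validation accuracy:", bo_bos())
```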
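The Swimmer-v2 setup above groups each 1000-step episode into N = 1000/m epochs, optimizes the discounted return, and reports the un-discounted one. A small sketch of that bookkeeping follows, under our reading of the description; the per-epoch discount placement and the `episode_returns` helper are assumptions, not the paper's code.

```python
# Sketch of the discounted-vs-undiscounted return bookkeeping described
# for the Swimmer-v2 experiment (assumed reading, not the paper's code).
import numpy as np


def episode_returns(step_rewards, m=20, gamma=0.9):
    # Group a 1000-step episode into N = 1000/m epochs, then compute the
    # discounted return (the BO-BOS objective) alongside the un-discounted
    # return (the quantity actually recorded and reported).
    step_rewards = np.asarray(step_rewards)
    n_epochs = len(step_rewards) // m                       # N = 1000 / m
    epoch_rewards = step_rewards[: n_epochs * m].reshape(n_epochs, m).sum(axis=1)
    discounts = gamma ** np.arange(n_epochs)                # per-epoch discounting (assumed)
    discounted = float(np.dot(discounts, epoch_rewards))    # objective for BO-BOS
    undiscounted = float(epoch_rewards.sum())               # ultimate objective, reported
    return discounted, undiscounted


rewards = np.random.default_rng(1).normal(0.1, 0.05, size=1000)
disc, undisc = episode_returns(rewards, m=20, gamma=0.9)    # N = 50, gamma = 0.9
print(f"discounted: {disc:.2f}  un-discounted: {undisc:.2f}")
```

With m = 20 and γ = 0.9 this reproduces the N = 50 configuration the report cites; choosing N and γ so that the discounted return tracks its un-discounted counterpart is exactly the constraint stated in the setup description.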