Stopping Bayesian Optimization with Probabilistic Regret Bounds
Authors: James Wilson
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | These findings are accompanied by empirical results which demonstrate the strengths and weaknesses of the proposed approach. ... Finally, Section 5 investigates its empirical performance under idealized and realistic circumstances. |
| Researcher Affiliation | Industry | James T. Wilson Morgan Stanley, New York, USA james.t.wilson@morganstanley.com |
| Pseudocode | Yes | Algorithm 1 BO with Monte Carlo PRB... Algorithm 2 Monte Carlo PRB |
| Open Source Code | Yes | code is available online at https://github.com/j-wilson/trieste_stopping. |
| Open Datasets | Yes | Experiments were performed by first running BO with conservatively chosen budgets T N. We then stepped through each saved run with different stopping rules to establish stopping times and terminal performance. This paradigm ensured fair comparisons and reduced compute overheads. We performed a hundred independent BO runs for all problems other than hyperparameter tuning for convolutional neural networks (CNNs) on MNIST [14], where only fifty runs were carried out. ... income prediction [7] |
| Dataset Splits | No | The paper mentions generating "train-test splits" for certain problems but does not specify exact percentages or sample counts for training, validation, or test sets. |
| Hardware Specification | Yes | Runtimes reported in Figure 3 were measured on an Apple M1 Pro chip using an off-the-shelf build of TensorFlow [1]. |
| Software Dependencies | No | The paper mentions software like GPflow, Trieste, and TensorFlow, but does not specify their version numbers in the text. For example, in C.1: "We employed Gaussian process priors f ∼ GP(µ, k) ... using an off-the-shelf build of TensorFlow [1]." |
| Experiment Setup | Yes | Each BO run was tasked with finding an ϵ-optimal point with probability at least 1 − δ = 95%. On the Rosenbrock-4 fine-tuning problem, we used a regret bound ϵ = 10⁻⁴. For CNNs, we aimed to be within ϵ = 0.5% of the best test error (i.e., misclassification rate) seen across all runs, namely 0.62%. Likewise, when fitting XGBoost classifiers [12] for income prediction [7], we sought to be within 1% of the best found test error of 12.89%. For all other problems, we set ϵ = 0.1. ... Each model was trained using Adam [24], with batches of size 64. |
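To make the stopping criterion behind "Monte Carlo PRB" concrete, the following is a minimal sketch of a Monte Carlo probabilistic-regret-bound check, not the paper's Algorithm 2: it assumes a hypothetical `sample_posterior` callable that draws a joint posterior sample of the objective over a discretized search space, and declares stopping once the estimated probability that the incumbent is ϵ-optimal reaches 1 − δ.

```python
import numpy as np

def monte_carlo_prb_stop(sample_posterior, incumbent_index,
                         epsilon=0.1, delta=0.05,
                         num_samples=1000, seed=0):
    """Hedged sketch of a Monte Carlo probabilistic-regret-bound stop rule.

    sample_posterior: hypothetical callable(rng) -> 1-D array giving one
        joint posterior draw of the (minimized) objective over a
        discretized search space.
    incumbent_index: index of the current candidate (incumbent) point.
    Returns True when the Monte Carlo estimate of
        P(f[incumbent] - min f <= epsilon)
    is at least 1 - delta.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(num_samples):
        f = sample_posterior(rng)                 # one draw f ~ p(f | data)
        regret = f[incumbent_index] - f.min()     # simple regret under this draw
        hits += regret <= epsilon
    return hits / num_samples >= 1.0 - delta
```

For instance, with a degenerate posterior that always returns the same values, the rule stops iff the incumbent's regret is within ϵ on every draw; in practice the draws would come from the fitted Gaussian process posterior.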