Global Optimization with Parametric Function Approximation

Authors: Chong Liu, Yu-Xiang Wang

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Synthetic and real-world experiments illustrate GO-UCB works better than popular Bayesian optimization approaches, even if the model is misspecified.
Researcher Affiliation | Academia | Department of Computer Science, University of California, Santa Barbara, CA 93106, USA.
Pseudocode | Yes | Algorithm 1: GO-UCB
Open Source Code | No | All implementations are based on the BoTorch framework (Balandat et al., 2020) and the sklearn package (Head et al., 2021) with default parameter settings.
Open Datasets | Yes | Three UCI datasets (Dua & Graff, 2017): Breast-cancer, Australian, and Diabetes.
Dataset Splits | Yes | To reduce the effect of randomness, we divide each dataset into 5 folds and each time use 4 folds for training and the remaining fold for testing.
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | All implementations are based on the BoTorch framework (Balandat et al., 2020) and the sklearn package (Head et al., 2021) with default parameter settings.
Experiment Setup | Yes | To run GO-UCB, we choose our parametric function model f̂ to be a two-linear-layer neural network with sigmoid as the activation function: f̂(x) = linear2(sigmoid(linear1(x))), where w1, b1 denote the weight and bias of the linear1 layer and w2, b2 those of the linear2 layer. Specifically, we set w1 ∈ R^{25×d_x}, b1 ∈ R^{25}, w2 ∈ R^{25}, b2 ∈ R, i.e., the hidden (activation) dimension is 25. ... Noise parameter σ = 0.01. The regression oracle in GO-UCB is approximated by stochastic gradient descent on this two-linear-layer network with mean squared error loss, 2000 iterations, and learning rate 10^{-11}. ... we use an iterative gradient ascent algorithm over x and w with 2000 iterations and learning rate 10^{-4}. ... We set n = 5, T = 25 for f1 and n = 8, T = 64 for f2, f3.
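The surrogate model and regression oracle described in the setup row can be sketched in plain NumPy. This is a minimal illustration only: the toy target function, weight initialization, and the learning rate (chosen larger than the paper's so the sketch visibly trains) are all assumptions, and the paper's actual implementation builds on BoTorch and sklearn.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TwoLayerModel:
    """f_hat(x) = linear2(sigmoid(linear1(x))) with a 25-unit hidden layer."""

    def __init__(self, d_x, hidden=25):
        self.w1 = rng.normal(scale=0.1, size=(hidden, d_x))  # w1 in R^{25 x d_x}
        self.b1 = np.zeros(hidden)                           # b1 in R^{25}
        self.w2 = rng.normal(scale=0.1, size=hidden)         # w2 in R^{25}
        self.b2 = 0.0                                        # b2 in R

    def forward(self, x):
        h = sigmoid(self.w1 @ x + self.b1)
        return self.w2 @ h + self.b2, h

    def sgd_step(self, x, y, lr):
        """One SGD step on squared error, standing in for the regression oracle."""
        pred, h = self.forward(x)
        err = pred - y
        dh = err * self.w2 * h * (1.0 - h)  # backprop through the sigmoid layer
        self.w2 -= lr * err * h
        self.b2 -= lr * err
        self.w1 -= lr * np.outer(dh, x)
        self.b1 -= lr * dh
        return 0.5 * err ** 2

# Hypothetical smooth target standing in for noisy bandit feedback.
d_x = 3
target = lambda x: np.sin(x).sum()

model = TwoLayerModel(d_x)
losses = []
for _ in range(2000):          # 2000 SGD iterations, as in the paper's setup
    x = rng.uniform(-1.0, 1.0, size=d_x)
    y = target(x) + rng.normal(scale=0.01)   # noise parameter sigma = 0.01
    losses.append(model.sgd_step(x, y, lr=0.05))
```

The paper's acquisition step (iterative gradient ascent over x and w) would reuse the same `forward` pass, ascending the upper confidence bound instead of descending the squared error; that part is omitted here.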