High Dimensional Bayesian Optimization using Dropout

Authors: Cheng Li, Sunil Gupta, Santu Rana, Vu Nguyen, Svetha Venkatesh, Alistair Shilton

IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the efficacy of our algorithms for optimization on two benchmark functions and two real-world applications: training cascade classifiers and optimizing alloy composition. The experimental results demonstrate the effectiveness of our algorithms. Our experimental results on synthetic and real applications show that our methods work effectively for high-dimensional optimization."
Researcher Affiliation | Academia | "Cheng Li, Sunil Gupta, Santu Rana, Vu Nguyen, Svetha Venkatesh, Alistair Shilton. Centre for Pattern Recognition and Data Analytics (PRaDA), Deakin University, Australia. cheng.l@deakin.edu.au"
Pseudocode | Yes | "Algorithm 1: Dropout Algorithm for High-dimensional Bayesian Optimization" (a hedged Python sketch of this procedure follows the table).
Open Source Code | No | The paper does not provide any links to open-source code or explicit statements about code availability.
Open Datasets | Yes | "We evaluate the dropout algorithm by training a cascade classifier [Viola and Jones, 2001] on three real datasets from the UCI repository: IJCNN1, German, and Ionosphere."
Dataset Splits | No | The paper mentions using initial observations and running the algorithms multiple times with different initializations, but it does not specify explicit training, validation, or test splits (e.g., percentages or sample counts per split) for reproducibility.
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions a Gaussian process (GP) surrogate, the SE kernel, and DIRECT [Jones et al., 1993], but it does not name any software with version numbers (e.g., Python, PyTorch, or a specific GP library) for reproducibility.
Experiment Setup | Yes | "For standard BO we allocate a budget of 30 seconds... The number of initial observations is set at d + 1. We use the SE kernel with lengthscale 0.1 and DIRECT [Jones et al., 1993] to optimize acquisition functions. We experiment with d = 1, 2, 5, 10 for D = 20 in Dropout-Copy. We set p = 0, 0.1, 0.5, 0.8, 1. We test our algorithms with d = 2 for D = 5 and d = 5 for D = 10, 20, 30. Dropout-Mix is applied with p = 0.1. We run 500 function evaluations for these two functions. The number of stages is set equal to the number of features in the dataset. We use d = 5 and p = 0.1 for all datasets." (these settings are collected into a configuration sketch after the table.)
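
To make the Pseudocode row concrete, here is a minimal Python sketch of the dropout idea behind Algorithm 1. It assumes a scikit-learn GP with a fixed SE/RBF lengthscale of 0.1, a UCB acquisition, random candidate search standing in for DIRECT, and a "mix"-style fill rule in which dropped dimensions are copied from the best point so far and replaced with random values with probability p. The function names, signatures, and the exact role of p are assumptions for illustration; this is not the authors' implementation.

```python
# Hypothetical sketch of dropout-style high-dimensional BO (not the authors' code).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF


def dropout_bo(objective, bounds, d=5, p=0.1, n_iter=500, n_candidates=2000, seed=0):
    """Maximize `objective` over a D-dimensional box by fitting the GP on only
    d randomly chosen dimensions per iteration (the "dropout" step)."""
    rng = np.random.default_rng(seed)
    D = bounds.shape[0]
    lo, hi = bounds[:, 0], bounds[:, 1]

    # d + 1 initial observations, matching the reported setup.
    X = rng.uniform(lo, hi, size=(d + 1, D))
    y = np.array([objective(x) for x in X])

    for _ in range(n_iter):
        dims = rng.choice(D, size=d, replace=False)          # keep d of the D dimensions
        gp = GaussianProcessRegressor(
            kernel=RBF(length_scale=0.1, length_scale_bounds="fixed"),
            normalize_y=True,
        ).fit(X[:, dims], y)

        # UCB acquisition over the kept dimensions; random candidates stand in
        # for the DIRECT optimizer used in the paper.
        cand = rng.uniform(lo[dims], hi[dims], size=(n_candidates, d))
        mu, sigma = gp.predict(cand, return_std=True)
        x_sub = cand[np.argmax(mu + 2.0 * sigma)]

        # Fill the dropped D - d dimensions: copy from the incumbent best, or
        # (with probability p) use a fresh random point -- a "mix"-style rule.
        x_next = X[np.argmax(y)].copy()
        if rng.random() < p:
            x_next = rng.uniform(lo, hi)
        x_next[dims] = x_sub

        X = np.vstack([X, x_next])
        y = np.append(y, objective(x_next))

    best = int(np.argmax(y))
    return X[best], y[best]


# Toy usage: a 20-D quadratic, optimizing d = 5 dimensions per iteration.
if __name__ == "__main__":
    bounds = np.tile([-5.0, 5.0], (20, 1))
    x_best, y_best = dropout_bo(lambda x: -np.sum(x ** 2), bounds, d=5, p=0.1, n_iter=100)
    print(y_best)
```

Under this sketch's convention, always taking the random branch (p = 1) or the copy branch (p = 0) would correspond to the paper's Dropout-Random and Dropout-Copy variants, respectively.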
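
As referenced in the Experiment Setup row, the reported settings can also be collected into a single configuration object. The dictionary layout and key names below are my own framing; the values are taken from the quoted setup.

```python
# Hypothetical summary of the reported experiment setup; structure invented here,
# values taken from the paper as quoted in the table above.
EXPERIMENT_SETUP = {
    "initial_observations": "d + 1",
    "surrogate": {"kernel": "SE", "length_scale": 0.1},
    "acquisition_optimizer": "DIRECT",                     # [Jones et al., 1993]
    "standard_bo_budget_seconds": 30,
    "synthetic_benchmarks": {
        "dropout_copy_sweep": {"D": 20, "d": [1, 2, 5, 10]},
        "p_values": [0, 0.1, 0.5, 0.8, 1],
        "d_per_D": {5: 2, 10: 5, 20: 5, 30: 5},            # D -> d
        "dropout_mix_p": 0.1,
        "function_evaluations": 500,
    },
    "cascade_classifier": {
        "datasets": ["IJCNN1", "German", "Ionosphere"],    # UCI repository
        "stages": "equal to the number of features",
        "d": 5,
        "p": 0.1,
    },
}
```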