The Parallel Knowledge Gradient Method for Batch Bayesian Optimization

Authors: Jian Wu, Peter Frazier

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments on both synthetic functions and tuning practical machine learning algorithms, q-KG consistently finds better function values than other parallel BO algorithms, such as parallel EI [2, 19, 25], batch UCB [5] and parallel UCB with exploration [3]. q-KG provides especially large value when function evaluations are noisy.
Researcher Affiliation | Academia | Jian Wu, Peter I. Frazier, Cornell University, Ithaca, NY 14853, {jw926, pf98}@cornell.edu
Pseudocode | Yes | Algorithm 1: The q-KG algorithm (see the acquisition sketch after this table)
Open Source Code | Yes | The code in this paper is available at https://github.com/wujian16/qKG.
Open Datasets | Yes | First, we tune logistic regression on the MNIST dataset... In the second experiment, we tune a CNN on the CIFAR10 dataset.
Dataset Splits | Yes | We train logistic regression on a training set with 60000 instances with a given set of hyperparameters and test it on a test set with 10000 instances. ... We train the CNN on the 50000 training instances with certain hyperparameters and test it on the test set with 10000 instances.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used to run its experiments.
Software Dependencies | No | The paper mentions software such as C++, a Python interface, GP regression and GP hyperparameter fitting methods, the Metrics Optimization Engine, Spearmint, and Gpoptimization, but it does not specify version numbers for any of these components.
Experiment Setup | Yes | We set the batch size to q = 4. ... We initiate our algorithms by randomly sampling 2d + 2 points from a Latin hypercube design, where d is the dimension of the problem. ... We use a constant mean prior and the ARD Matérn 5/2 kernel. ... We set M = 1000 to discretize the domain following the strategy in Section 5.3. ... We tune 4 hyperparameters: mini-batch size from 10 to 2000, training iterations from 100 to 10000, the ℓ2 regularization parameter from 0 to 1, and learning rate from 0 to 1. (see the setup sketch after this table)
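To make the Pseudocode row concrete: the q-KG acquisition measures the expected increase in the maximum of the GP posterior mean after adding a batch of q evaluations, and Algorithm 1 repeatedly picks the batch that maximizes it. Below is a minimal NumPy sketch of a plain Monte Carlo estimator of that quantity on a discretized domain. It is an illustration only, not the paper's released implementation (which uses a more efficient estimator with infinitesimal-perturbation-analysis gradients and continuous optimization); the names qkg_mc_estimate, mu_n, cov_n, batch_idx, and noise_var are assumptions made here for the sketch.

```python
import numpy as np

def qkg_mc_estimate(mu_n, cov_n, batch_idx, n_samples=1000, noise_var=1e-6, rng=None):
    """Plain Monte Carlo estimate of the q-KG value of a candidate batch.

    mu_n      : (M,) current posterior mean over a discretized domain A.
    cov_n     : (M, M) current posterior covariance over the same M points.
    batch_idx : indices into A of the q candidate batch points.
    Returns an estimate of E[max_x mu_new(x)] - max_x mu_n(x).
    """
    rng = np.random.default_rng() if rng is None else rng
    q = len(batch_idx)
    # Covariance blocks needed for the one-step update of the posterior mean.
    K_zz = cov_n[np.ix_(batch_idx, batch_idx)] + noise_var * np.eye(q)
    K_Az = cov_n[:, batch_idx]                      # (M, q)
    L = np.linalg.cholesky(K_zz)
    # Under the current posterior, the batch observations y satisfy
    # y - mu_n[batch] ~ N(0, K_zz), so the updated mean
    # mu_new = mu_n + K_Az K_zz^{-1} (y - mu_n[batch])
    # has the same distribution as mu_n + (K_Az L^{-T}) w with w ~ N(0, I_q).
    sigma_tilde = np.linalg.solve(L, K_Az.T).T      # K_Az L^{-T}, shape (M, q)
    w = rng.standard_normal((n_samples, q))
    updated_means = mu_n[None, :] + w @ sigma_tilde.T   # (n_samples, M)
    return updated_means.max(axis=1).mean() - mu_n.max()
```

In the batch loop of Algorithm 1, each iteration would select the batch maximizing this value (the paper optimizes over the continuous domain rather than over index sets), evaluate the q points in parallel, refit the GP, and repeat until the budget is spent.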
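Likewise, the Experiment Setup row can be illustrated with a short sketch of the initialization and GP prior it describes: 2d + 2 Latin hypercube points over the 4-dimensional logistic-regression search space and an ARD Matérn 5/2 kernel. This is a sketch assuming SciPy's qmc module for the design; the length scales and amplitude below are placeholder values, whereas the paper fits its GP hyperparameters from data.

```python
import numpy as np
from scipy.stats import qmc

# Search space from the paper's logistic-regression experiment:
# mini-batch size, training iterations, l2 regularization, learning rate.
lower = np.array([10.0, 100.0, 0.0, 0.0])
upper = np.array([2000.0, 10000.0, 1.0, 1.0])
d = len(lower)                                   # d = 4

# Initial design: 2d + 2 points from a Latin hypercube, as stated in the paper.
sampler = qmc.LatinHypercube(d=d, seed=0)
X0 = qmc.scale(sampler.random(n=2 * d + 2), lower, upper)   # shape (10, 4)

def ard_matern52(X1, X2, lengthscales, amplitude=1.0):
    """ARD Matern 5/2 kernel; lengthscales and amplitude are placeholders here."""
    diff = X1[:, None, :] / lengthscales - X2[None, :, :] / lengthscales
    r = np.sqrt(np.sum(diff ** 2, axis=-1))
    return amplitude * (1.0 + np.sqrt(5.0) * r + 5.0 * r ** 2 / 3.0) * np.exp(-np.sqrt(5.0) * r)

K = ard_matern52(X0, X0, lengthscales=(upper - lower) / 10.0)
print(X0.shape, K.shape)                          # (10, 4) (10, 10)
```

The batch size q = 4 and the M = 1000 domain discretization mentioned in the same row are not shown here; they enter only when the acquisition above is optimized.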