Bayesian Optimization with Gradients

Authors: Jian Wu, Matthias Poloczek, Andrew G. Wilson, Peter Frazier

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In numerical experiments we compare with state-of-the-art batch Bayesian optimization algorithms with and without derivative information, and the gradient-based optimizer BFGS with full gradients.
Researcher Affiliation | Academia | 1: Cornell University, 2: University of Arizona
Pseudocode | Yes | Algorithm 1: d-KG with Relevant Directional Derivative Detection (an illustrative sketch of a derivative-aware BO loop follows the table)
Open Source Code | Yes | The code for this paper is available at https://github.com/wujian16/Cornell-MOE.
Open Datasets | Yes | We use the yellow cab NYC public data set from June 2016, sampling 10000 records from June 1-25 as training data and 1000 trip records from June 26-30 as validation data. ... We tune logistic regression and a feedforward neural network with 2 hidden layers on the MNIST dataset [20], a standard classification task for handwritten digits.
Dataset Splits | Yes | We use the yellow cab NYC public data set from June 2016, sampling 10000 records from June 1-25 as training data and 1000 trip records from June 26-30 as validation data. ... The training set contains 60000 images, the test set 10000. (a split sketch follows the table)
Hardware Specification | No | The paper discusses computational complexity and scaling (e.g., exact GP inference scales as O(n³(d+1)³)), but it does not provide specific hardware details such as GPU/CPU models or memory used for the experiments. (a worked scaling example follows the table)
Software Dependencies | No | The paper mentions using the 'emcee package' and 'scipy' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We choose m in [30, 200], ℓ₁² in [10¹, 10⁸], and ℓ₂², ℓ₃², ℓ₄², ℓ₅² each in [10⁻⁸, 10⁻¹]. ... We tune 4 hyperparameters for logistic regression: the ℓ2 regularization parameter from 0 to 1, learning rate from 0 to 1, mini-batch size from 20 to 2000 and training epochs from 5 to 50. ... We also experiment with two different batch sizes: we use a batch size q = 4 for the Branin, Rosenbrock, and Ackley functions; otherwise, we use a batch size q = 8. (a search-space sketch follows the table)
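The Pseudocode row refers to Algorithm 1 (d-KG with Relevant Directional Derivative Detection). The paper's algorithm is not reproduced here; the sketch below is only a rough, hedged illustration of the underlying idea: a 1-D Gaussian process conditioned on both function values and derivative observations (standard RBF derivative covariances) driving a simple sequential lower-confidence-bound loop. The acquisition, toy objective, noise level, and sequential (rather than batch) setting are placeholder choices of mine, not the paper's d-KG acquisition.

```python
import numpy as np

def rbf_blocks(x1, x2, ell=1.0, s2=1.0):
    """Joint RBF covariance of [f, f'] at x1 versus [f, f'] at x2 (1-D inputs)."""
    r = x1[:, None] - x2[None, :]
    k = s2 * np.exp(-0.5 * r**2 / ell**2)
    k_f_df = k * r / ell**2                         # cov(f(x1), f'(x2))
    k_df_f = -k * r / ell**2                        # cov(f'(x1), f(x2))
    k_df_df = k * (1.0 / ell**2 - r**2 / ell**4)    # cov(f'(x1), f'(x2))
    return np.block([[k, k_f_df], [k_df_f, k_df_df]])

def posterior(x_tr, y, dy, x_te, noise=1e-6):
    """Posterior mean/std of f at x_te, given values y and derivatives dy at x_tr (ell = s2 = 1)."""
    K = rbf_blocks(x_tr, x_tr) + noise * np.eye(2 * len(x_tr))
    r = x_te[:, None] - x_tr[None, :]
    k = np.exp(-0.5 * r**2)
    K_star = np.hstack([k, k * r])                  # cross-covariance of f(x_te) with [y, dy]
    alpha = np.linalg.solve(K, np.concatenate([y, dy]))
    mean = K_star @ alpha
    var = 1.0 - np.einsum("ij,ij->i", K_star, np.linalg.solve(K, K_star.T).T)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def f(x):   # placeholder objective with an analytic gradient (not from the paper)
    return np.sin(3.0 * x) + 0.1 * x**2

def df(x):
    return 3.0 * np.cos(3.0 * x) + 0.2 * x

rng = np.random.default_rng(0)
x_obs = rng.uniform(-2.0, 2.0, size=3)
grid = np.linspace(-2.0, 2.0, 401)
for _ in range(10):
    cand = np.setdiff1d(grid, x_obs)                # avoid re-evaluating an already chosen point
    mean, std = posterior(x_obs, f(x_obs), df(x_obs), cand)
    x_next = cand[np.argmin(mean - 2.0 * std)]      # simple lower-confidence-bound acquisition
    x_obs = np.append(x_obs, x_next)
print("best x found:", x_obs[np.argmin(f(x_obs))])
```

The point of the sketch is the block covariance: conditioning on derivatives only changes the kernel matrix, after which posterior inference and acquisition optimization proceed as in ordinary Bayesian optimization.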
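The Dataset Splits row quotes only split sizes and date ranges. The pandas sketch below shows one plausible way to realize that split; the file name, timestamp column, and sampling seed are assumptions based on the public TLC release, not details given in the paper.

```python
import pandas as pd

# Assumed file and column names for the June 2016 yellow-cab release (not specified in the paper).
trips = pd.read_csv("yellow_tripdata_2016-06.csv", parse_dates=["tpep_pickup_datetime"])
day = trips["tpep_pickup_datetime"].dt.day

# 10000 training records from June 1-25 and 1000 validation records from June 26-30,
# matching the sizes reported in the paper; the random seed is arbitrary.
train = trips[day <= 25].sample(n=10000, random_state=0)
valid = trips[day >= 26].sample(n=1000, random_state=0)

# MNIST uses its standard 60000/10000 train/test split, e.g. via Keras:
# (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
```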
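The Hardware Specification row notes that the paper reports only the O(n³(d+1)³) cost of exact GP inference with gradients. As a quick worked illustration (the numbers are mine, not measurements from the paper): each of the n evaluations contributes d+1 correlated outputs, so the joint covariance matrix has n(d+1) rows, and a Cholesky factorization scales cubically in that size.

```python
# Illustrative scaling only; the (n, d) pairs below are arbitrary, not from the paper.
for n, d in [(50, 3), (200, 6), (500, 8)]:
    m = n * (d + 1)   # rows/columns of the gradient-augmented covariance matrix
    print(f"n={n:4d}, d={d}: matrix {m}x{m}, Cholesky cost ~ {m**3:.2e} flops (up to a constant)")
```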
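Finally, the Experiment Setup row lists the tuned ranges. The dictionaries below restate those ranges as a search-space configuration; the parameter names, the dict-of-bounds representation, and any log-scale interpretation are my own conventions, not the paper's configuration format.

```python
# Search spaces restated from the quoted setup; the representation is an assumption.
kiss_gp_space = {
    "m": (30, 200),                                        # number of inducing points
    "ell1_sq": (1e1, 1e8),                                 # first squared length scale
    **{f"ell{i}_sq": (1e-8, 1e-1) for i in range(2, 6)},   # remaining squared length scales
}

logistic_regression_space = {
    "l2_regularization": (0.0, 1.0),
    "learning_rate": (0.0, 1.0),
    "mini_batch_size": (20, 2000),
    "training_epochs": (5, 50),
}

# Batch sizes used in the benchmarks: q = 4 for Branin, Rosenbrock, Ackley; q = 8 otherwise.
batch_size = {"branin": 4, "rosenbrock": 4, "ackley": 4, "default": 8}
```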