Bayesian Optimization with Tree-structured Dependencies

Authors: Rodolphe Jenatton, Cedric Archambeau, Javier González, Matthias Seeger

ICML 2017

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental. "Our experiments on synthetic tree-structured objectives and on the tuning of feedforward neural networks show that our method compares favorably with competing approaches."
Researcher Affiliation: Industry. "¹Amazon, Berlin, Germany. ²Amazon, Cambridge, United Kingdom. Correspondence to: Rodolphe Jenatton <jenatton@amazon.de>, Cedric Archambeau <cedrica@amazon.de>, Javier Gonzalez <gojav@amazon.co.uk>, Matthias Seeger <matthias@amazon.de>."
Pseudocode: No. The paper describes procedures and mathematical models but does not contain a dedicated pseudocode or algorithm block.
Open Source Code: No. The paper states "Our implementation is in Python" but does not provide an explicit statement about open-sourcing the code or a link to a repository for its specific methodology.
Open Datasets: Yes. "To provide a robust evaluation of the different competing methods, we consider a subset of the datasets from the Libsvm repository (Chang & Lin, 2011)."
Dataset Splits: No. The paper states "In absence of pre-defined default train-test split, we took a random 80%/20% split.", which specifies only train and test splits, without explicit mention of a separate validation set or cross-validation strategy (see the data-handling sketch after this table).
Hardware Specification: Yes. "Our implementation is in Python and we ran the experiments on a fleet of Amazon AWS c4.8xlarge machines."
Software Dependencies: No. The paper mentions software such as Python and scikit-learn, and refers to the GPyOpt and SMAC implementations, but does not provide specific version numbers for these dependencies.
Experiment Setup: Yes. "We optimize for the number of hidden layers in {0, 1, 2, 3, 4}, the number of units per layer in {1, 2, ..., 30} (provided the corresponding layer is activated), the choice of the activation function in {identity, logistic, tanh, relu}, which we constrain to be identical across all layers, the amount of ℓ2 regularization in [10^-6, 10^-1], the learning rate in [10^-5, 10^-1] of the underlying Adam solver (Kingma & Ba, 2014), the tolerance in [10^-5, 10^-2] of the solver (based on relative decrease), and the type of data pre-processing, which can be unit ℓ2-norm observation-wise normalization, ℓ∞-norm feature-wise normalization, mean/standard-deviation feature-wise whitening, or no normalization at all. ... we add a CPU-time constraint of 5 minutes to each evaluation, beyond which the worst classification error 1.0 is returned." (A code sketch of this search space follows the table.)
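
As a companion to the Open Datasets and Dataset Splits rows, here is a minimal sketch of loading a Libsvm-format dataset and drawing a random 80%/20% split. It assumes scikit-learn utilities (the paper mentions scikit-learn); the file name and random seed are hypothetical placeholders, not taken from the paper.

```python
# Minimal sketch: load a Libsvm-format dataset and take a random 80%/20%
# train-test split, as described in the Dataset Splits row.
from sklearn.datasets import load_svmlight_file
from sklearn.model_selection import train_test_split

# Hypothetical file name; any dataset from the Libsvm repository would do.
X, y = load_svmlight_file("a1a")
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0  # random 80%/20% split
)
```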
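
To make the Experiment Setup row concrete, the sketch below restates the search space and the time-capped objective in code. It assumes scikit-learn's MLPClassifier (the paper mentions scikit-learn but does not publish its tuning code); the names search_space and evaluate_config, the dictionary layout, and the post-hoc CPU-time check are illustrative assumptions rather than the authors' implementation.

```python
# Illustrative sketch of the hyperparameter search space and the time-capped
# objective described in the Experiment Setup row (not the authors' code).
import time
from sklearn.neural_network import MLPClassifier

search_space = {
    "num_layers": [0, 1, 2, 3, 4],                    # number of hidden layers
    "units_per_layer": list(range(1, 31)),            # only active if num_layers > 0
    "activation": ["identity", "logistic", "tanh", "relu"],
    "alpha": (1e-6, 1e-1),                            # l2 regularization range
    "learning_rate_init": (1e-5, 1e-1),               # Adam learning rate range
    "tol": (1e-5, 1e-2),                              # solver tolerance range
    # Pre-processing choice (row-wise l2, feature-wise l-inf, standardization,
    # or none); its application is omitted from this sketch for brevity.
    "preprocessing": ["l2_rows", "linf_cols", "standardize", "none"],
}

TIME_BUDGET_SECONDS = 5 * 60  # 5-minute CPU-time cap per evaluation


def evaluate_config(config, X_train, y_train, X_test, y_test):
    """Train one configuration and return the test classification error in [0, 1]."""
    # An empty tuple corresponds to zero hidden layers (the num_layers = 0 case).
    hidden = tuple([config["units_per_layer"]] * config["num_layers"])
    model = MLPClassifier(
        hidden_layer_sizes=hidden,
        activation=config["activation"],
        solver="adam",
        alpha=config["alpha"],
        learning_rate_init=config["learning_rate_init"],
        tol=config["tol"],
    )
    start = time.process_time()
    model.fit(X_train, y_train)
    # Post-hoc check only: a faithful implementation would abort the evaluation
    # once the CPU-time budget is exceeded and then report the worst error 1.0.
    if time.process_time() - start > TIME_BUDGET_SECONDS:
        return 1.0
    return 1.0 - model.score(X_test, y_test)
```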