Scalable Hyperparameter Transfer Learning

Authors: Valerio Perrone, Rodolphe Jenatton, Matthias W. Seeger, Cedric Archambeau

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments show that the neural net learns a representation suitable for warm-starting the black-box optimization problems, and that BO runs can be accelerated when the target black-box function (e.g., validation loss) is learned together with other related signals (e.g., training loss). The proposed method was found to be at least one order of magnitude faster than competing methods recently published in the literature. Section 4 presents experiments on simulated and real data, reporting favorable comparisons with existing alternatives when leveraging data across auxiliary tasks and signals.
Researcher Affiliation | Industry | Valerio Perrone, Rodolphe Jenatton, Matthias Seeger, Cédric Archambeau; Amazon, Berlin, Germany; {vperrone, jenatton, matthis, cedrica}@amazon.com
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; it describes the BO loop in natural language. (A minimal sketch of a generic BO loop is given after this table.)
Open Source Code | No | The paper mentions using publicly available code for *other* methods (DNGO, BOHAMIANN) at a given URL, but does not provide a link or an explicit statement about releasing the source code for its own ABLR method.
Open Datasets | Yes | In Sections 4.2 and 4.3, we evaluate its potential to transfer information between tasks defined by, respectively, synthetic data and OpenML data [32]. In Section 4.4, we investigate the transfer learning ability of ABLR in the presence of multiple heterogeneous signals. OpenML data [32], LIBSVM [45]. (A short example of programmatic OpenML access is given after this table.) [32] Joaquin Vanschoren, Jan N. van Rijn, Bernd Bischl, and Luis Torgo. OpenML: Networked science in machine learning. ACM SIGKDD Explorations Newsletter, 15(2):49–60, 2014. [45] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011.
Dataset Splits | No | The paper mentions using a 'leave-one-task-out' protocol and tuning parameters based on 'validation error', but it does not specify explicit train/validation/test splits (e.g., percentages or sample counts) for reproducibility. (An illustrative leave-one-task-out sketch is given after this table.)
Hardware Specification | Yes | All our measurements are made on a c4.2xlarge AWS machine.
Software Dependencies | No | The paper mentions using GPyOpt [33] and MXNet [12], with Adam [40] for optimization, but does not provide specific version numbers for these software components.
Experiment Setup | Yes | The NN that learns the feature map φ_z(x) is similar to the one used in [18]. It has three fully connected layers, each with 50 units and a tanh activation function. The dimension D = 100 was picked after we investigated the computation time of ABLR-based HPO with learned NN features (D = 50) and with RKS features (D ∈ {50, 100, 200}). All BO experiments use the expected improvement acquisition function [2]. The feedforward NN was trained for 200 iterations, each time on a batch of 200 samples. (A sketch of this feature-map-plus-Bayesian-linear-regression architecture is given after this table.)
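
The Pseudocode row notes that the BO loop is described only in natural language. The following is a minimal, hypothetical sketch of such a loop using the expected improvement acquisition mentioned in the Experiment Setup row; it is not the authors' implementation. In particular, the surrogate here is a plain scikit-learn Gaussian process rather than the paper's ABLR model, and the candidate-sampling scheme, grid sizes, and seeds are illustrative assumptions.

```python
# Hypothetical sketch of a basic BO loop with expected improvement (EI).
# The surrogate is a scikit-learn GP, *not* the paper's ABLR model.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(mu, sigma, best_y, xi=0.01):
    # EI for minimization: expected improvement over the incumbent best_y.
    sigma = np.maximum(sigma, 1e-9)
    z = (best_y - mu - xi) / sigma
    return (best_y - mu - xi) * norm.cdf(z) + sigma * norm.pdf(z)

def bo_loop(black_box, bounds, n_init=5, n_iter=30, n_candidates=2000, seed=0):
    rng = np.random.default_rng(seed)
    lows = [b[0] for b in bounds]
    highs = [b[1] for b in bounds]
    dim = len(bounds)
    # Initial random design.
    X = rng.uniform(lows, highs, size=(n_init, dim))
    y = np.array([black_box(x) for x in X])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        cand = rng.uniform(lows, highs, size=(n_candidates, dim))
        mu, sigma = gp.predict(cand, return_std=True)
        x_next = cand[np.argmax(expected_improvement(mu, sigma, y.min()))]
        X = np.vstack([X, x_next])
        y = np.append(y, black_box(x_next))
    return X[np.argmin(y)], y.min()

# Example: minimize a toy quadratic over [-3, 3]^2.
best_x, best_y = bo_loop(lambda x: float(np.sum(x ** 2)), bounds=[(-3, 3), (-3, 3)])
```

The loop only needs the surrogate's predictive mean and variance to score candidates, which is the same interface an ABLR surrogate would provide.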
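
The Open Datasets row cites OpenML [32] as one of the data sources. If one wanted to retrieve OpenML datasets programmatically, the `openml` Python package offers a simple entry point; the dataset ID below is purely illustrative and is not one named by the paper.

```python
# Fetching an OpenML dataset programmatically (illustrative ID, not from the paper).
import openml

dataset = openml.datasets.get_dataset(31)  # e.g., the "credit-g" dataset
X, y, categorical, attribute_names = dataset.get_data(
    target=dataset.default_target_attribute
)
print(X.shape, len(attribute_names))
```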
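
The Dataset Splits row refers to a leave-one-task-out protocol without explicit splits. The snippet below illustrates the general shape of such a protocol, holding out each task in turn as the HPO target while the remaining tasks serve as transfer sources; the callback `run_hpo_on_target` is a placeholder, not a function from the paper.

```python
# Illustrative leave-one-task-out protocol: each task in turn is the target,
# and the evaluation histories of all remaining tasks are used for transfer.
def leave_one_task_out(task_ids, run_hpo_on_target):
    results = {}
    for target in task_ids:
        source_tasks = [t for t in task_ids if t != target]
        # run_hpo_on_target is a placeholder: it would warm-start BO on `target`
        # using data gathered on `source_tasks` and return a score.
        results[target] = run_hpo_on_target(target, source_tasks)
    return results
```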
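
The Experiment Setup row describes the feature map φ_z(x) as three fully connected layers of 50 tanh units each, on top of which a Bayesian linear regression head produces predictive means and variances. The sketch below expresses that architecture with MXNet Gluon (the framework named in the Software Dependencies row) and NumPy; fixing the precisions `alpha` and `beta`, instead of learning them jointly with the network weights by marginal-likelihood optimization as the paper does, is a simplification of this sketch.

```python
# Sketch of an ABLR-style model: a learned feature map followed by
# Bayesian linear regression. The precisions alpha/beta are fixed here,
# whereas the paper learns them (and the network weights) jointly.
import numpy as np
import mxnet as mx
from mxnet.gluon import nn

def make_feature_map(num_units=50, num_layers=3):
    # Three fully connected layers, 50 units each, tanh activations,
    # as described in the Experiment Setup row.
    net = nn.HybridSequential()
    for _ in range(num_layers):
        net.add(nn.Dense(num_units, activation="tanh"))
    net.initialize(mx.init.Xavier())
    return net

def blr_posterior(phi, y, alpha=1.0, beta=1.0):
    # Closed-form Bayesian linear regression on top of features phi (N x D):
    # posterior covariance K_inv and posterior mean weights mean_w.
    D = phi.shape[1]
    K = alpha * np.eye(D) + beta * phi.T @ phi
    K_inv = np.linalg.inv(K)
    mean_w = beta * K_inv @ phi.T @ y
    return mean_w, K_inv

def predict(net, mean_w, K_inv, x, beta=1.0):
    # Predictive mean and variance at a new hyperparameter configuration x.
    phi_x = net(mx.nd.array(x[None, :])).asnumpy().ravel()
    mu = phi_x @ mean_w
    var = 1.0 / beta + phi_x @ K_inv @ phi_x
    return mu, var

# Toy usage with random data: 20 configurations of dimension 4.
X = np.random.randn(20, 4).astype("float32")
y = np.random.randn(20)
net = make_feature_map()
features = net(mx.nd.array(X)).asnumpy()
mean_w, K_inv = blr_posterior(features, y)
mu, var = predict(net, mean_w, K_inv, X[0])
```

In the paper the network weights and the per-task BLR hyperparameters are trained jointly (200 Adam iterations on batches of 200 samples, per the row above); the closed-form posterior here only shows the prediction path.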