Multi-Information Source Optimization

Authors: Matthias Poloczek, Jialei Wang, Peter Frazier

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental evaluations demonstrate that this algorithm consistently finds designs of higher value at less cost than previous approaches.
Researcher Affiliation | Collaboration | Matthias Poloczek, Department of Systems and Industrial Engineering, University of Arizona, Tucson, AZ 85721, poloczek@email.arizona.edu; Jialei Wang, Chief Analytics Office, IBM, Armonk, NY 10504, jw865@cornell.edu; Peter I. Frazier, School of Operations Research and Information Engineering, Cornell University, Ithaca, NY 14853, pf98@cornell.edu
Pseudocode | Yes | The parallel algorithm to compute (ℓ^(n+1), x^(n+1)):
1. Scatter the pairs (ℓ, x) ∈ [M]₀ × A among the machines.
2. Each machine computes MKG_n(ℓ, x) for its pairs. To compute MKG_n(ℓ, x) in parallel:
   a. Sort the points in A by ascending σ̃_n^x(ℓ, x) in parallel, thereby removing dominated points. Let S be the sorted sequence.
   b. Split S into sequences S_1, …, S_C, where C is the number of cores used to compute MKG_n(ℓ, x). Each core c computes Σ_{x_i ∈ S_c} (b_{i+1} − b_i) · u(−|d_i|) in parallel, then the partial sums are added to obtain E_n[max_i {a_i + b_i Z} − max_i a_i].
3. Determine (ℓ^(n+1), x^(n+1)) ∈ argmax_{ℓ ∈ [M]₀, x ∈ D} MKG_n(ℓ, x) in parallel.
(A runnable sketch of this computation appears after the table.)
Open Source Code | Yes | An implementation of our method is available at https://github.com/misoKG/.
Open Datasets | Yes | The goal is to optimize four hyperparameters of the logistic regression algorithm [36] using a stochastic gradient method with mini-batches (the learning rate, the L2-regularization parameter, the batch size, and the number of epochs) to minimize the classification error on the MNIST dataset [21]. IS 1 uses the USPS dataset [38] of about 9000 images with 256 pixels each. (A sketch of this benchmark appears after the table.)
Dataset Splits | Yes | In our experiments all methods were given identical initial datasets for each information source in every replication; these sets were drawn randomly via Latin hypercube designs. For the sake of simplicity, we provided the same number of points for each IS, set to 2.5 points per dimension of the design space D. (A sketch of such an initial design appears after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the experiments.
Software Dependencies | No | We implemented misoKG's statistical model and acquisition function in Python 2.7 and C++, leveraging functionality from the Metrics Optimization Engine [23]. Specific version numbers for the C++ compiler, the Python libraries, or the Metrics Optimization Engine are not provided.
Experiment Setup | Yes | Experimental Setup. We conduct experiments on the following test problems: (1) the 2-dimensional Rosenbrock function, modified to fit the MISO setting by Lam et al. [18]; (2) a MISO benchmark proposed by Swersky et al. [34], in which we optimize the 4 hyperparameters of a machine learning algorithm, using a small, related set of smaller images as a cheap IS; (3) an assemble-to-order problem from Hong and Nelson [13], in which we optimize an 8-dimensional target stock vector to maximize the expected daily profit of a company as estimated by a simulator.
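
The pseudocode row above compresses the key computation, so here is a minimal, self-contained Python sketch of it. This is not the authors' implementation: kg_factor applies the standard sort-and-prune evaluation of E_n[max_i {a_i + b_i Z} − max_i a_i], multiprocessing stands in for the scatter step, and the synthetic a, b vectors and the names kg_factor and mkg are illustrative assumptions.

    import numpy as np
    from multiprocessing import Pool
    from scipy.stats import norm

    def kg_factor(a, b):
        # E_n[max_i (a_i + b_i Z)] - max_i a_i for Z ~ N(0, 1), where a_i are
        # posterior means over the discretization A and b_i the sigma-tilde values.
        order = np.lexsort((a, b))               # ascending slope, ties by intercept
        a, b = a[order], b[order]
        keep = np.concatenate([b[1:] != b[:-1], [True]])  # one line per distinct slope
        a, b = a[keep], b[keep]
        idx, d = [0], [-np.inf]                  # upper-envelope lines and breakpoints
        for i in range(1, len(a)):
            while True:
                j = idx[-1]
                z = (a[j] - a[i]) / (b[i] - b[j])  # where line i overtakes line j
                if z <= d[-1]:                     # line j never attains the maximum
                    idx.pop(); d.pop()
                else:
                    idx.append(i); d.append(z)
                    break
        b, d = b[idx], np.array(d[1:])           # surviving slopes, interior breakpoints
        u = lambda t: t * norm.cdf(t) + norm.pdf(t)
        return float(np.sum((b[1:] - b[:-1]) * u(-np.abs(d))))

    def mkg(seed):
        # One candidate (l, x): synthetic stand-ins for mu_n(0, .) and sigma-tilde,
        # divided by a unit query cost.
        rng = np.random.default_rng(seed)
        a, b = rng.normal(size=50), rng.gamma(2.0, size=50)
        return kg_factor(a, b) / 1.0

    if __name__ == "__main__":
        with Pool() as pool:                     # step 1: scatter the candidates
            scores = pool.map(mkg, range(8))     # step 2: evaluate MKG_n in parallel
        print(int(np.argmax(scores)))            # step 3: report the argmax candidate

Step 2b's per-core partial sums collapse here into the single np.sum inside kg_factor; splitting that sum across cores, as the pseudocode describes, is a straightforward extension.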
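For the hyperparameter benchmark quoted in the Open Datasets row, the sketch below makes the objective concrete: logistic regression trained by mini-batch SGD, scored by classification error. It is an assumption-laden stand-in, not the paper's setup: sklearn's 8×8 digits replace MNIST/USPS so the code runs offline, and the hyperparameter values shown are arbitrary.

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import train_test_split

    def classification_error(lr, l2, batch_size, epochs, seed=0):
        # Logistic regression via mini-batch SGD over the four tuned
        # hyperparameters: learning rate, L2 penalty, batch size, epochs.
        X_tr, X_te, y_tr, y_te = train_test_split(*load_digits(return_X_y=True),
                                                  random_state=seed)
        clf = SGDClassifier(loss="log_loss", penalty="l2", alpha=l2,
                            learning_rate="constant", eta0=lr, random_state=seed)
        classes = np.unique(y_tr)
        rng = np.random.default_rng(seed)
        for _ in range(int(epochs)):
            order = rng.permutation(len(X_tr))   # reshuffle each epoch
            for start in range(0, len(order), int(batch_size)):
                batch = order[start:start + int(batch_size)]
                clf.partial_fit(X_tr[batch], y_tr[batch], classes=classes)
        return 1.0 - clf.score(X_te, y_te)       # objective to minimize

    # One query at an arbitrary point of the 4-dimensional design space; the
    # cheap IS would call the same routine on the smaller, related dataset.
    err = classification_error(lr=0.01, l2=1e-4, batch_size=32, epochs=5)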
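Finally, the initial-design rule in the Dataset Splits row (a Latin hypercube of 2.5 points per dimension of D for each IS) could be generated as follows. scipy's qmc module and the [-2, 2]² bounds are assumptions made for illustration; the review does not name a library or restate the domain.

    import numpy as np
    from scipy.stats import qmc

    def initial_design(dim, n_sources, lower, upper, seed=0):
        # ceil(2.5 * dim) points per information source, each IS getting its
        # own Latin hypercube scaled to the design space D = [lower, upper].
        n_points = int(np.ceil(2.5 * dim))
        designs = []
        for s in range(n_sources):
            sampler = qmc.LatinHypercube(d=dim, seed=seed + s)
            designs.append(qmc.scale(sampler.random(n_points), lower, upper))
        return designs

    # e.g. two information sources for a 2-d problem: 5 points each
    d0, d1 = initial_design(dim=2, n_sources=2, lower=[-2, -2], upper=[2, 2])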