Predictive Entropy Search for Multi-objective Bayesian Optimization

Authors: Daniel Hernández-Lobato, José Miguel Hernández-Lobato, Amar Shah, Ryan P. Adams

ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare PESMO with other methods on synthetic and real-world problems. The results show that PESMO produces better recommendations with a smaller number of evaluations, and that a decoupled evaluation can lead to improvements in performance, particularly when the number of objectives is large.
Researcher Affiliation | Collaboration | Daniel Hernández-Lobato (DANIEL.HERNANDEZ@UAM.ES), Universidad Autónoma de Madrid, Francisco Tomás y Valiente 11, 28049, Madrid, Spain. José Miguel Hernández-Lobato (JMHL@SEAS.HARVARD.EDU), Harvard University, 33 Oxford Street, Cambridge, MA 02138, USA. Amar Shah (AS793@CAM.AC.UK), Cambridge University, Trumpington Street, Cambridge CB2 1PZ, United Kingdom. Ryan P. Adams (RPA@SEAS.HARVARD.EDU), Harvard University and Twitter, 33 Oxford Street, Cambridge, MA 02138, USA.
Pseudocode | No | The paper describes the approach using mathematical equations and textual explanations but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | We have coded all these methods in the software for Bayesian optimization Spearmint (https://github.com/HIPS/Spearmint).
Open Datasets | Yes | We consider the MNIST dataset (Le Cun et al., 1998).
Dataset Splits | Yes | The prediction error is measured on a set of 10,000 instances extracted from the training set. The rest of the training data, i.e., 50,000 instances, is used for training. (See the split sketch after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions the 'Keras library', 'Adam (D. Kingma, 2014)', and the 'open-BLAS library' but does not specify version numbers for these software components.
Experiment Setup | Yes | The adjustable parameters are: the number of hidden units per layer (between 50 and 300), the number of layers (between 1 and 3), the learning rate, the amount of dropout, and the level of ℓ1 and ℓ2 regularization. (See the search-space sketch after the table.)
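The Dataset Splits row can be made concrete with a short sketch. This is not the authors' code: it is a minimal illustration, assuming the standard 60,000-image MNIST training set is loaded through keras.datasets, of holding out 10,000 instances for measuring the prediction error while keeping the remaining 50,000 for training; all variable names are illustrative.

```python
# Illustrative sketch of the 50,000/10,000 split described in the
# "Dataset Splits" row (not the authors' code). Assumes the standard
# 60,000-image MNIST training set from keras.datasets.
import numpy as np
from keras.datasets import mnist

(x_train_full, y_train_full), (x_test, y_test) = mnist.load_data()

# Flatten the 28x28 images and rescale pixel values to [0, 1].
x_train_full = x_train_full.reshape(-1, 784).astype("float32") / 255.0

# Hold out 10,000 instances from the training set to measure the
# prediction error; the remaining 50,000 instances are used for training.
rng = np.random.RandomState(0)
perm = rng.permutation(x_train_full.shape[0])
valid_idx, train_idx = perm[:10000], perm[10000:]

x_valid, y_valid = x_train_full[valid_idx], y_train_full[valid_idx]
x_train, y_train = x_train_full[train_idx], y_train_full[train_idx]
```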
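For the Experiment Setup row, the sketch below encodes the stated search space and builds a feed-forward network from one candidate configuration using Keras with the Adam optimizer, matching the dependencies listed above. Only the ranges for hidden units per layer (50 to 300) and number of layers (1 to 3) come from the paper; the ranges assumed for the learning rate, dropout, and ℓ1/ℓ2 penalties, and the helper name build_mlp, are hypothetical.

```python
# Hedged sketch of the hyperparameter space from the "Experiment Setup"
# row and of a network built from one candidate configuration.
# Only the unit and layer ranges are stated in the paper; the remaining
# ranges and the helper name are assumptions for illustration.
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.regularizers import l1_l2
from keras.optimizers import Adam

search_space = {
    "n_hidden_units": (50, 300),    # per layer, stated in the paper
    "n_layers": (1, 3),             # stated in the paper
    "learning_rate": (1e-4, 1e-1),  # assumed range
    "dropout": (0.0, 0.9),          # assumed range
    "l1": (0.0, 1e-3),              # assumed range
    "l2": (0.0, 1e-3),              # assumed range
}

def build_mlp(config, n_inputs=784, n_classes=10):
    """Build a feed-forward classifier for one candidate configuration."""
    model = Sequential()
    model.add(Dense(config["n_hidden_units"], activation="relu",
                    kernel_regularizer=l1_l2(l1=config["l1"], l2=config["l2"]),
                    input_shape=(n_inputs,)))
    model.add(Dropout(config["dropout"]))
    for _ in range(config["n_layers"] - 1):
        model.add(Dense(config["n_hidden_units"], activation="relu",
                        kernel_regularizer=l1_l2(l1=config["l1"], l2=config["l2"])))
        model.add(Dropout(config["dropout"]))
    model.add(Dense(n_classes, activation="softmax"))
    model.compile(optimizer=Adam(learning_rate=config["learning_rate"]),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Example: one configuration drawn from the middle of each assumed range.
example_config = {"n_hidden_units": 175, "n_layers": 2,
                  "learning_rate": 1e-3, "dropout": 0.5,
                  "l1": 1e-4, "l2": 1e-4}
model = build_mlp(example_config)
```

In an experiment like the one quoted above, each such configuration would then be evaluated on the objectives being optimized, e.g., the prediction error measured on the 10,000 held-out instances.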