Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
A Portfolio Approach to Massively Parallel Bayesian Optimization
Authors: Mickaël Binois, Nicholson Collier, Jonathan Ozik
JAIR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare the approach with related methods on noisy functions, for mono and multi-objective optimization tasks. These experiments show orders of magnitude speed improvements over existing methods with similar or better performance. |
| Researcher Affiliation | Academia | Mickaël Binois, Inria, Université Côte d'Azur, CNRS, LJAD, Sophia Antipolis, France; Nicholson Collier and Jonathan Ozik, Argonne National Laboratory, Lemont, IL, USA, and Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA |
| Pseudocode | Yes | Algorithm 1 Pseudo-code for batch BO; Algorithm 2 Pseudo-code for batch BO with qHSRI |
| Open Source Code | Yes | The R code (R Core Team, 2023) of the approach is available as supplementary material. |
| Open Datasets | Yes | We consider the training of a convolutional neural network (CNN) used for the classification of digits based on the MNIST data (LeCun et al., 1998); The R code (R Core Team, 2023) of the approach and the CityCOVID data are available as supplementary material. |
| Dataset Splits | Yes | CNN used for the classification of digits based on the MNIST data (LeCun et al., 1998), with 70,000 handwritten digits (including 10,000 for testing). |
| Hardware Specification | Yes | Results have been obtained in parallel on dual-Xeon Skylake SP Silver 4114 @ 2.20GHz (20 cores) and 192 GB RAM (or similar nodes). Lunar lander tests have been run on a laptop. The CNN training is performed on GeForce GTX 1080 Ti GPUs. |
| Software Dependencies | Yes | The R package hetGP (Binois & Gramacy, 2021) is used for noisy GP modeling. DiceOptim (Picheny et al., 2021), or the approximated one from Binois (2015), qAEI. pso (Bendtsen, 2012) (population of size 200) is conducted too. NSGA-II (Deb et al., 2002) from mco (Mersmann, 2020) is used to find P. The R package DiceKriging (Roustant et al., 2012) is used for deterministic GP modeling. |
| Experiment Setup | Yes | The six variables of the CNN are given in Table 3. The architecture is composed of two 2D convolutional + max pooling layers, before two dense layers with dropout. The reference point used for hypervolume computations is [0, 150]. The CNN training is performed on GeForce GTX 1080 Ti GPUs. The nine variables of the simulator are given in Table 4. |
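
The Experiment Setup row mentions hypervolume computations against the reference point [0, 150]. As a rough illustration of what such a computation involves, the sketch below computes a two-objective hypervolume under the assumption that both objectives are minimized. This is not the authors' R code: the function name `hypervolume_2d` and the example points are hypothetical, and the meaning and units of the two objectives behind [0, 150] are not stated in this summary.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Sequence[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref`.

    Assumption: both objectives are minimized, so a point contributes
    only if it is strictly better than `ref` in both coordinates.
    """
    # Discard points that do not dominate the reference point,
    # then sweep from the smallest first objective to the largest.
    candidates = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])

    area = 0.0
    best_second = ref[1]  # lowest second-objective value seen so far
    for first, second in candidates:
        if second < best_second:
            # Non-dominated point: add the new horizontal slab it covers.
            area += (ref[0] - first) * (best_second - second)
            best_second = second
    return area


if __name__ == "__main__":
    # Hypothetical objective vectors, for illustration only.
    example_points = [(-0.99, 140.0), (-0.98, 100.0), (-0.95, 50.0)]
    print(hypervolume_2d(example_points, ref=(0.0, 150.0)))
```

A larger hypervolume means the set of points covers more of the region bounded by the reference point, so the value depends directly on which reference point is chosen.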