Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

A Portfolio Approach to Massively Parallel Bayesian Optimization

Authors: Mickael Binois, Nicholson Collier, Jonathan Ozik

JAIR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We compare the approach with related methods on noisy functions, for mono and multi-objective optimization tasks. These experiments show orders of magnitude speed improvements over existing methods with similar or better performance.
Researcher Affiliation	Academia	Mickaël Binois EMAIL, Inria, Université Côte d'Azur, CNRS, LJAD, Sophia Antipolis, France; Nicholson Collier EMAIL, Jonathan Ozik EMAIL, Argonne National Laboratory, Lemont, IL, USA; Consortium for Advanced Science and Engineering, University of Chicago, Chicago, IL, USA
Pseudocode	Yes	Algorithm 1 Pseudo-code for batch BO; Algorithm 2 Pseudo-code for batch BO with qHSRI
Open Source Code	Yes	The R code (R Core Team, 2023) of the approach is available as supplementary material.
Open Datasets	Yes	We consider the training of a convolutional neural network (CNN) used for the classification of digits based on the MNIST data (LeCun et al., 1998); The R code (R Core Team, 2023) of the approach and the CityCOVID data are available as supplementary material.
Dataset Splits	Yes	CNN used for the classification of digits based on the MNIST data (LeCun et al., 1998), with 70,000 handwritten digits (including 10,000 for testing).
Hardware Specification	Yes	Results have been obtained in parallel on dual-Xeon Skylake SP Silver 4114 @ 2.20GHz (20 cores) and 192 GB RAM (or similar nodes). Lunar lander tests have been run on a laptop. The CNN training is performed on GeForce GTX 1080 Ti GPUs.
Software Dependencies	Yes	The R package hetGP (Binois & Gramacy, 2021) is used for noisy GP modeling. DiceOptim (Picheny et al., 2021), or the approximated one from Binois (2015), qAEI. pso (Bendtsen, 2012) (population of size 200) is conducted too. NSGA-II (Deb et al., 2002) from mco (Mersmann, 2020) is used to find P. The R package DiceKriging (Roustant et al., 2012) is used for deterministic GP modeling.
Experiment Setup	Yes	The six variables of the CNN are given in Table 3. The architecture is composed of two 2D convolutional + max pooling layers, before two dense layers with dropout. The reference point used for hypervolume computations is [0, 150]. The CNN training is performed on GeForce GTX 1080 Ti GPUs. The nine variables of the simulator are given in Table 4.
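
The Experiment Setup row mentions a reference point of [0, 150] for hypervolume computations. As a rough illustration of what such a reference point does, the sketch below computes a two-objective hypervolume in Python. It is not the authors' R code. It assumes both objectives are minimized, and the function name `hypervolume_2d` and the sample points are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref`.

    Assumption: both objectives are minimized, so a point only
    contributes if it is strictly better than `ref` in both coordinates.
    """
    # Discard points that do not dominate the reference point,
    # then sweep in ascending order of the first objective.
    candidates = sorted(
        (x, y) for x, y in points if x < ref[0] and y < ref[1]
    )
    area = 0.0
    best_y = ref[1]  # lowest second-objective value seen so far
    for x, y in candidates:
        if y < best_y:
            # This point is non-dominated among those swept so far:
            # add the new horizontal slab it contributes.
            area += (ref[0] - x) * (best_y - y)
            best_y = y
    return area


if __name__ == "__main__":
    # Hypothetical objective values, NOT results from the paper.
    reference = (0.0, 150.0)
    front = [(-0.9, 100.0), (-0.8, 50.0), (-0.5, 120.0)]
    print(hypervolume_2d(front, reference))
```

A point that is worse than the reference point in either objective adds nothing to the indicator, which is why the choice of reference point needs to be reported for hypervolume values to be comparable.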