Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Multi-Fidelity Multi-Objective Bayesian Optimization: An Output Space Entropy Search Approach

Authors: Syrine Belakaria, Aryan Deshwal, Janardhan Rao Doppa | Pages: 10035-10043

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on several synthetic and real-world benchmark problems show that MF-OSEMO, with both approximations, significantly improves over the state-of-the-art single-fidelity algorithms for multi-objective optimization.
Researcher Affiliation | Academia | Syrine Belakaria, Aryan Deshwal, Janardhan Rao Doppa School of EECS, Washington State University EMAIL
Pseudocode | Yes | Algorithm 1 MF-OESMO Algorithm
Open Source Code | No | The paper mentions employing code for baselines from the BO library Spearmint, providing a link to its GitHub repository (https://github.com/HIPS/Spearmint/tree/PESM). However, it does not state that the source code for the proposed MF-OSEMO method is open-source or publicly available.
Open Datasets | Yes | Synthetic benchmarks. We construct two synthetic benchmark problems using a combination of commonly employed benchmark functions for multi-fidelity and single-objective optimization (footnote 2), and two of the known general MO benchmarks (Habib, Singh, and Ray 2019). Their complete details are provided in Table 2. Footnote 2: https://www.sfu.ca/~ssurjano/optimization.html. We consider a design space of NoC dataset consisting of 1024 implementation of a network-on-chip (Che et al. 2009).
Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits with percentages, sample counts, or citations to predefined splits. It describes initialization of models and continuous evaluation on benchmark problems rather than a fixed split methodology for model evaluation.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as exact GPU/CPU models, processor types, or memory amounts.
Software Dependencies | No | The paper mentions using 'MF-GP models', 'squared exponential (SE) kernels', 'Spearmint' for baselines, and 'NSGA-II algorithm'. However, it does not provide specific version numbers for any of these software components or libraries.
Experiment Setup | Yes | The hyper-parameters are estimated after every 5 function evaluations. We initialize the MF-GP models for all functions by sampling initial points at random from a Sobol grid. We initialise each of the lower fidelities with 5 points and the highest fidelity with only one point.
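
The initialization described under Experiment Setup can be illustrated with a minimal sketch. This is not the authors' code, which the page reports as unavailable. The function name `initial_design`, the grid size, the seed, and the unit-cube input domain are all assumptions made for illustration; only the per-fidelity counts (5 points for each lower fidelity, 1 for the highest) come from the quoted text. It uses SciPy's `scipy.stats.qmc.Sobol` sampler.

```python
import numpy as np
from scipy.stats import qmc


def initial_design(dim, n_fidelities, n_low=5, n_high=1, grid_log2=6, seed=0):
    """Pick initial points at random from a Sobol grid, one array per fidelity.

    Hypothetical helper: the grid size (2**grid_log2 points), the seed, and the
    unit-cube domain are illustrative assumptions, not values from the paper.
    Returns a list ordered from lowest to highest fidelity.
    """
    rng = np.random.default_rng(seed)
    # Build a Sobol grid of 2**grid_log2 points in [0, 1)^dim.
    grid = qmc.Sobol(d=dim, scramble=True, seed=seed).random_base2(m=grid_log2)
    # 5 points for each lower fidelity, 1 point for the highest fidelity.
    counts = [n_low] * (n_fidelities - 1) + [n_high]
    # Draw distinct grid indices at random, then split them across fidelities.
    idx = rng.choice(len(grid), size=sum(counts), replace=False)
    splits = np.split(idx, np.cumsum(counts)[:-1])
    return [grid[s] for s in splits]


if __name__ == "__main__":
    design = initial_design(dim=3, n_fidelities=3)
    for level, points in enumerate(design):
        print(f"fidelity {level}: {points.shape[0]} initial point(s)")
```

Points are returned in the unit cube; mapping them onto a benchmark's actual input bounds (for example with `qmc.scale`) is left to the caller, since the bounds are problem-specific.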