Offline Contextual Bayesian Optimization

Authors: Ian Char, Youngseog Chung, Willie Neiswanger, Kirthevasan Kandasamy, Andrew Oakleigh Nelson, Mark Boyer, Egemen Kolemen, Jeff Schneider

NeurIPS 2019

Reproducibility variables, each with its result and the supporting LLM response:
Research Type: Experimental. LLM response: "We also demonstrate that if the model of the reward structure does a poor job of capturing variation in difficulty between tasks, then algorithms that actively pick tasks for evaluation may end up doing more harm than good. Following this, we show how our approach can be used for real world applications in science and engineering, including optimizing tokamak controls for nuclear fusion." (Sections 3.3, Synthetic Experiments, and 5, Application to Nuclear Fusion.)
Researcher Affiliation: Academia. LLM response: "Ian Char1, Youngseog Chung1, Willie Neiswanger1, Kirthevasan Kandasamy2, Andrew Oakleigh Nelson3, Mark D Boyer3, Egemen Kolemen3, and Jeff Schneider1; 1Department of Machine Learning, Carnegie Mellon University, {ichar, youngsec, willie, schneide}@cs.cmu.edu; 2Department of EECS, University of California, Berkeley, kandasamy@eecs.berkeley.edu; 3Princeton Plasma Physics Laboratory, {anelson, mboyer, ekolemen}@pppl.gov"
Pseudocode: Yes. LLM response: "Algorithm 1 Multi-Task Thompson Sampling (MTS)"
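Algorithm 1 is only named in the table, so for orientation the following is a minimal sketch of a multi-task Thompson-sampling loop, not a reproduction of the paper's pseudocode. It assumes one independent GP per task, a shared finite candidate grid, uniform task weights, noiseless feedback, and a selection rule that picks the task/point pair with the largest Thompson-sampled improvement; the actual Algorithm 1 may differ in its selection rule and GP handling, and the function `multi_task_thompson_sampling` below is hypothetical.

```python
# Illustrative sketch of a multi-task Thompson-sampling loop (NOT a verbatim
# reproduction of the paper's Algorithm 1). Assumptions: one independent GP per
# task, a shared finite candidate grid, uniform task weights, noiseless feedback.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def multi_task_thompson_sampling(tasks, candidates, n_init=5, n_evals=200, seed=0):
    """tasks: list of callables f_i(x) -> float; candidates: (m, d) array."""
    rng = np.random.default_rng(seed)
    # Initialize every task with a few uniformly random evaluations.
    X = [list(candidates[rng.choice(len(candidates), n_init, replace=False)])
         for _ in tasks]
    y = [[f(x) for x in Xi] for f, Xi in zip(tasks, X)]

    for _ in range(n_evals):
        best = (-np.inf, None, None)  # (sampled gain, task index, candidate point)
        for i in range(len(tasks)):
            gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True)
            gp.fit(np.asarray(X[i]), np.asarray(y[i]))
            # Thompson sample: one draw from the GP posterior over the grid.
            draw = gp.sample_y(candidates,
                               random_state=int(rng.integers(1 << 31))).ravel()
            j = int(np.argmax(draw))
            gain = draw[j] - max(y[i])  # sampled improvement over the incumbent
            if gain > best[0]:
                best = (gain, i, candidates[j])
        _, i, x_next = best
        X[i].append(x_next)
        y[i].append(tasks[i](x_next))  # evaluate the chosen task at the chosen point
    return X, y
```

In the paper's setup, the candidate set would be replaced by each task's continuous domain and the GP modeling and hyperparameter tuning would be handled through Dragonfly rather than scikit-learn; the sketch only conveys the overall sample-then-select structure.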
Open Source Code: Yes. LLM response: "An implementation of our algorithm and synthetic experiments can be found at https://github.com/fusion-ml/OCBO."
Open Datasets: Yes. LLM response: "For the first synthetic problem, we wish to optimize over 5 functions: four of which are concave paraboloids (with a range of [0, 1]) and the other being the Branin-Hoo function [Branin, 1972]. In particular, we use the Branin-Hoo, Hartmann 4, and Hartmann 6 [Picheny et al., 2013] functions to create Branin 1-1, Hartmann 2-2, Hartmann 3-1, and Hartmann 4-2."
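The Branin-Hoo benchmark mentioned above has a standard closed form; as a reference point (not taken from the paper), a direct NumPy implementation looks like the snippet below. The Hartmann 4 and Hartmann 6 functions follow published coefficient tables [Picheny et al., 2013] and are omitted for brevity.

```python
import numpy as np

def branin_hoo(x1, x2):
    """Standard Branin-Hoo test function, usually evaluated on
    x1 in [-5, 10], x2 in [0, 15]; global minimum value is about 0.397887."""
    a, b, c = 1.0, 5.1 / (4 * np.pi**2), 5.0 / np.pi
    r, s, t = 6.0, 10.0, 1.0 / (8 * np.pi)
    return a * (x2 - b * x1**2 + c * x1 - r) ** 2 + s * (1 - t) * np.cos(x1) + s
```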
Dataset Splits: No. LLM response: The paper states, "We start by evaluating each task with 5 points drawn uniformly at random," but does not specify explicit training, validation, or test splits as percentages or counts; the experimental setup is based on adaptive sampling rather than fixed data splits.
Hardware Specification: Yes. LLM response: "The simulation runs were performed on PPPL's Stellar machine, which contains several hundred nodes, with 2 Intel E5-2695 v3 2.30 GHz processors each."
Software Dependencies: No. LLM response: The paper mentions using the "Dragonfly library" and "probabilistic programming and BO frameworks" but does not specify exact version numbers for these software dependencies, which would be necessary for full reproducibility.
Experiment Setup: Yes. LLM response: "We start by evaluating each task with 5 points drawn uniformly at random. Each task is modeled by a GP with an RBF kernel, and hyperparameters are tuned for a GP every time an observation is seen for its corresponding task. For two-dimensional functions, hyperparameters are tuned according to marginal likelihood, but for greater dimensions, tuning is done using a blend of marginal likelihood and posterior sampling. This method was found to be more robust by Kandasamy et al. [2019b]. Here, and throughout this section, we leverage the Dragonfly library for our experiments [Kandasamy et al., 2019b]. Lastly, in every experiment we let ω(x) = 1 for all x ∈ X and give noiseless feedback to the algorithms. 5 trials of each algorithm were run, with each trial consisting of 200 evaluations, and in each trial we allow up to 10 evaluations to be run in parallel."
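The statement that GP hyperparameters are "tuned according to marginal likelihood" corresponds to the standard practice of maximizing the log marginal likelihood over the kernel hyperparameters. The snippet below shows this on placeholder data with scikit-learn rather than Dragonfly, an assumption made only to keep the example self-contained; the data, kernel choice, and settings are illustrative, not the authors' configuration.

```python
# Standard marginal-likelihood tuning of an RBF GP, shown with scikit-learn
# (illustrative assumption; the paper itself uses the Dragonfly library).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(5, 2))          # 5 initial points, as in the paper
y = np.sin(3 * X[:, 0]) + np.cos(2 * X[:, 1])   # placeholder noiseless observations

kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0])
gp = GaussianProcessRegressor(kernel=kernel,
                              alpha=1e-10,              # ~noiseless feedback
                              n_restarts_optimizer=10,
                              normalize_y=True)
gp.fit(X, y)   # fitting maximizes the log marginal likelihood over kernel params
print("tuned kernel:", gp.kernel_)
print("log marginal likelihood:", gp.log_marginal_likelihood_value_)
```

For higher-dimensional tasks the paper blends this marginal-likelihood tuning with posterior sampling of the hyperparameters, a step not shown here.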