Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Optimizing Dynamic Structures with Bayesian Generative Search

Authors: Minh Hoang, Carleton Kingsford

ICML 2020 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | This section evaluates and reports the empirical performance of our kernel selection framework DTERGENS on a synthetic kernel recovery task and kernel selection for regression on three real-world datasets. |
| Researcher Affiliation | Academia | 1 Computer Science Department, School of Computer Science, Carnegie Mellon University, USA; 2 Computational Biology Department, School of Computer Science, Carnegie Mellon University, USA. |
| Pseudocode | Yes | Algorithm 1 DTERGENS Kernel Selection |
| Open Source Code | No | The paper does not provide explicit statements or links to open-source code for the described methodology. |
| Open Datasets | Yes | The DIABETES dataset (Efron et al., 2004) containing 442 diabetes patient records... The MAUNA LOA (Mauna Loa Atmospheric Carbon Dioxide) dataset (Keeling & Whorf, 2004)... The PROTEIN dataset (Rana, 2013) featuring 45730 observations of protein tertiary structures... |
| Dataset Splits | Yes | 80-10-10 train-test-validation split (i.e., we use the validation fold to compute BO feedback and the test fold to evaluate final performance); |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies (e.g., libraries, frameworks, or languages). |
| Experiment Setup | Yes | For all experiments, we demonstrate the performance of our framework on the black-box model Variational DTC Sparse Gaussian Process (vDTC) (Hensman et al., 2013) with the following configurations: (1) 80-10-10 train-test-validation split (i.e., we use the validation fold to compute BO feedback and the test fold to evaluate final performance); (2) 100 randomly selected inducing inputs; (3) kernel hyperparameters are optimized using L-BFGS over 100 iterations. |
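The Experiment Setup entry is specific enough to sketch in code. The snippet below is an illustrative sketch, not the authors' implementation: it uses synthetic 1-D data in place of the real datasets, and a simplified Titsias-style variational sparse-GP bound as a stand-in for the vDTC objective, while reproducing the three stated configuration choices (80-10-10 split, 100 random inducing inputs, L-BFGS over 100 iterations). All variable names, the toy data, and the parameter bounds are assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy 1-D regression data standing in for the paper's real datasets
# (442 is the DIABETES record count, used here purely for illustration).
N = 442
X = rng.uniform(-3.0, 3.0, size=(N, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(N)

# (1) 80-10-10 train-test-validation split.
perm = rng.permutation(N)
n_tr, n_te = int(0.8 * N), int(0.1 * N)
tr, te, va = perm[:n_tr], perm[n_tr:n_tr + n_te], perm[n_tr + n_te:]

# (2) 100 randomly selected inducing inputs, drawn from the training fold.
M = 100
Z = X[rng.choice(tr, size=M, replace=False)]

def rbf(A, B, log_ls):
    """Unit-variance RBF kernel with lengthscale exp(log_ls)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / np.exp(2.0 * log_ls))

def neg_bound(theta):
    """Negative Titsias-style variational lower bound -- a simplified
    stand-in for the vDTC objective (diag(Kff) = 1 for this kernel)."""
    log_ls, log_noise = theta
    sn2 = np.exp(2.0 * log_noise)
    n = len(tr)
    Kuu = rbf(Z, Z, log_ls) + 1e-5 * np.eye(M)   # jitter for stability
    Kuf = rbf(Z, X[tr], log_ls)
    L = np.linalg.cholesky(Kuu)
    A = np.linalg.solve(L, Kuf) / np.sqrt(sn2)   # M x n
    B = np.eye(M) + A @ A.T
    LB = np.linalg.cholesky(B)
    c = np.linalg.solve(LB, A @ y[tr])
    # log N(y | 0, Qff + sn2*I) via the matrix determinant/inversion lemmas.
    logdet = n * np.log(sn2) + 2.0 * np.log(np.diag(LB)).sum()
    quad = (y[tr] @ y[tr] - c @ c) / sn2
    bound = -0.5 * (n * np.log(2.0 * np.pi) + logdet + quad)
    # Trace correction: -(tr(Kff) - tr(Qff)) / (2*sn2).
    bound -= (n - sn2 * (A ** 2).sum()) / (2.0 * sn2)
    return -bound

# (3) Optimize kernel hyperparameters with L-BFGS over 100 iterations.
# The bounds on (log_ls, log_noise) are a stabilizing assumption.
res = minimize(neg_bound, x0=np.zeros(2), method="L-BFGS-B",
               bounds=[(-2.0, 2.0), (-4.0, 1.0)],
               options={"maxiter": 100})
print("optimized variational bound:", -res.fun)
```

In the paper's setup, the validation fold (`va`) would supply the Bayesian-optimization feedback signal and the test fold (`te`) the final performance number; here they are only carved out to show the split.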