Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Bayesian Optimization for Probabilistic Programs
Authors: Tom Rainforth, Tuan Anh Le, Jan-Willem van de Meent, Michael A. Osborne, Frank Wood
NeurIPS 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present applications of our method to a number of tasks including engineering design and parameter optimization. We first demonstrate the ability of BOPP to carry out unbounded optimization using a 1D problem with a significant prior-posterior mismatch as shown in Figure 4. Next we compare BOPP to the prominent BO packages SMAC [14], Spearmint [25] and TPE [3] on a number of classical benchmarks as shown in Figure 5. |
| Researcher Affiliation | Academia | Department of Engineering Science, University of Oxford; College of Computer and Information Science, Northeastern University |
| Pseudocode | No | The paper includes a high-level algorithm overview in Figure 3 but does not provide formal pseudocode blocks or labeled algorithms. |
| Open Source Code | Yes | Code available at http://www.github.com/probprog/bopp/ and at http://www.github.com/probprog/deodorant/ |
| Open Datasets | No | The paper uses benchmark problems and simulations (e.g., 'Energy2D simulations', 'Hartmann 6D', 'SVM on-grid', 'LDA on-grid', 'pickover attractor') rather than traditional public datasets with explicit access information. |
| Dataset Splits | No | The paper evaluates on benchmark functions and simulations, not traditional datasets with specified train/validation/test splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models) used for running the experiments. |
| Software Dependencies | No | The paper mentions software such as Anglican, Energy2D, Stan, Church, Venture, and WebPPL, but does not provide version numbers for these or any other software dependencies used in the experimental setup. |
| Experiment Setup | Yes | BOPP therefore employs an affine scaling to a [-1, 1] hypercube for both the inputs and outputs of the GP. We use as a default covariance function a combination of a Matérn-3/2 and Matérn-5/2 kernel. Inference over hyperparameters is performed using Hamiltonian Monte Carlo (HMC) [6]. r is a parameter set to 1.5 r_e by default. ...using a variant of annealed importance sampling [19] in which lightweight Metropolis-Hastings (LMH) [28] with local random-walk moves is used as the base transition kernel. |
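
The affine scaling and default covariance function quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not taken from the BOPP or Deodorant code: the function names are hypothetical, the "combination" of the two Matérn kernels is assumed here to be a plain sum, and the lengthscale and variance are fixed placeholders, whereas the paper infers hyperparameters with HMC.

```python
import math


def affine_to_unit_cube(points, lower, upper):
    """Map each coordinate from [lower[d], upper[d]] onto [-1, 1]."""
    return [
        [2.0 * (x - lo) / (hi - lo) - 1.0 for x, lo, hi in zip(p, lower, upper)]
        for p in points
    ]


def matern32(r, lengthscale=1.0, variance=1.0):
    """Matern-3/2 covariance as a function of the distance r."""
    s = math.sqrt(3.0) * r / lengthscale
    return variance * (1.0 + s) * math.exp(-s)


def matern52(r, lengthscale=1.0, variance=1.0):
    """Matern-5/2 covariance as a function of the distance r."""
    s = math.sqrt(5.0) * r / lengthscale
    return variance * (1.0 + s + s * s / 3.0) * math.exp(-s)


def combined_kernel(x, y):
    """Assumed combination: the sum of the two Matern kernels.

    Placeholder hyperparameters (lengthscale = variance = 1) are used for both.
    """
    r = math.dist(x, y)  # Euclidean distance, Python 3.8+
    return matern32(r) + matern52(r)


# Scale raw inputs into the hypercube, then evaluate the kernel on them.
scaled = affine_to_unit_cube([[0.0, 5.0], [1.0, 10.0]], [0.0, 0.0], [1.0, 10.0])
k = combined_kernel(scaled[0], scaled[1])
```

Scaling the inputs and outputs to a fixed range is what lets a single set of default hyperparameter priors be reused across problems with very different units.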