Practical Bayesian Algorithm Execution via Posterior Sampling
Authors: Chu Xin Cheng, Raul Astudillo, Thomas A. Desautels, Yisong Yue
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across diverse tasks demonstrate that PS-BAX performs competitively with existing baselines while being significantly faster, simpler to implement, and easily parallelizable, setting a strong baseline for future research. |
| Researcher Affiliation | Academia | Chu Xin Cheng California Institute of Technology ccheng2@caltech.edu Raul Astudillo California Institute of Technology rastudil@caltech.edu Thomas Desautels Lawrence Livermore National Laboratory desautels2@llnl.gov Yisong Yue California Institute of Technology yyue@caltech.edu |
| Pseudocode | Yes | Algorithm 1 PS-BAX (a hedged sketch of the posterior-sampling loop is given after the table) |
| Open Source Code | Yes | Code to reproduce our experiments is available at https://github.com/RaulAstudillo06/PSBAX. |
| Open Datasets | Yes | We evaluate the algorithms on a synthetic problem (the 2-dimensional Himmelblau function) and a real-world topographic dataset, consisting of 87 × 61 height measurements from a large geographic area around Auckland’s Maunga Whau volcano [44]... The first problem uses 3-dimensional Rosenbrock function, a standard benchmark in the optimization literature... The second problem is a real-world top-k (k = 10) selection task in protein design... Following [29], we use the tau protein assay [48] and interferon-gamma assay [49] datasets from the Achilles project [50]. |
| Dataset Splits | No | The paper mentions that "an initial dataset is generated by sampling 2(d + 1) inputs uniformly at random from X" and "Each experiment was replicated 30 times", but it does not specify explicit training, validation, and test dataset splits like percentages or absolute sample counts for data partitioning. |
| Hardware Specification | No | The paper reports average runtimes per iteration in Table 1 but explicitly states in the NeurIPS checklist: "Unfortunately, the exact details of the computing resources used are not available." No specific GPU or CPU models are mentioned. |
| Software Dependencies | No | The paper mentions: "All our algorithms are implemented using BoTorch [33]. Specifically, we use BoTorch's SingleTaskGP class..." However, it does not provide specific version numbers for BoTorch or any other software dependencies. |
| Experiment Setup | Yes | In all experiments, an initial dataset is generated by sampling 2(d + 1) inputs uniformly at random from X... Unless stated otherwise, the batch size is set to q = 1... Each experiment was replicated 30 times, with plots showing mean performance plus and minus 1.96 standard errors... For Ackley, we set the batch size to q = 2... As a performance metric, we report the log10 inference regret... The threshold τ is set to the 0.55 quantile of all function values in the domain... The performance metric used is the F1 score... we use a deep kernel GP [47] as our probabilistic model... we perform batched evaluations with batch size of q = 4... Approximate samples from the posterior on f for both PS-BAX and INFO-BAX are generated using 1000 random Fourier features [52]. For INFO-BAX, we use L = 30 Monte Carlo samples to estimate the EIG across all problems. (An illustrative BoTorch setup reflecting this configuration is sketched after the table.) |
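
The following is a minimal, hypothetical sketch of the posterior-sampling loop that Algorithm 1 (PS-BAX) describes: draw an approximate sample from the GP posterior, run the base algorithm (e.g., top-k selection or level-set estimation) on that sample, and query points from its output. The function names `gp_posterior_sample`, `base_algorithm`, and `ps_bax_step`, and the specific rule for picking among unevaluated outputs, are illustrative assumptions, not taken from the paper's codebase.

```python
# Hypothetical sketch of one PS-BAX iteration (not the authors' implementation).
import numpy as np


def ps_bax_step(train_x, train_y, gp_posterior_sample, base_algorithm, rng, q=1):
    """Select the next q query point(s) via posterior sampling.

    gp_posterior_sample: callable fitted on (train_x, train_y) that returns an
        approximate draw f_tilde: X -> R from the GP posterior.
    base_algorithm: callable that runs the target algorithm on f_tilde and
        returns the set of inputs defining its output (assumed interface).
    """
    f_tilde = gp_posterior_sample(train_x, train_y)   # 1. sample f_tilde from the posterior
    candidate_set = base_algorithm(f_tilde)           # 2. run the base algorithm on the sample
    # 3. prefer output points that have not been evaluated yet (illustrative rule)
    new_points = [x for x in candidate_set
                  if not any(np.allclose(x, xi) for xi in train_x)]
    if not new_points:
        # fall back to a random element of the algorithm's output
        new_points = [candidate_set[rng.integers(len(candidate_set))]]
    return new_points[:q]
```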
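
As a companion to the Software Dependencies and Experiment Setup rows, here is an illustrative BoTorch snippet consistent with the reported configuration: a SingleTaskGP surrogate, an initial design of 2(d + 1) uniform points, and approximate posterior sample paths built from roughly 1000 random Fourier features. The calls shown (`SingleTaskGP`, `fit_gpytorch_mll`, `get_gp_samples`) are standard BoTorch/GPyTorch APIs, but since the paper does not pin library versions, exact import paths may differ across releases; the objective values here are placeholders.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.utils.gp_sampling import get_gp_samples
from gpytorch.mlls import ExactMarginalLogLikelihood

d = 3                                  # e.g., the 3-dimensional Rosenbrock problem
n_init = 2 * (d + 1)                   # initial design size reported in the paper
train_x = torch.rand(n_init, d, dtype=torch.double)    # uniform samples in [0, 1]^d
train_y = torch.randn(n_init, 1, dtype=torch.double)   # placeholder objective values

# Fit the GP surrogate.
model = SingleTaskGP(train_x, train_y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)

# Draw one approximate posterior sample path via random Fourier features.
f_tilde = get_gp_samples(model=model, num_outputs=1, n_samples=1,
                         num_rff_features=1000)
candidates = torch.rand(5, d, dtype=torch.double)
values_on_sample = f_tilde.posterior(candidates).mean   # evaluate the sampled path
```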