Practical Bayesian Algorithm Execution via Posterior Sampling
Authors: Chu Xin Cheng, Raul Astudillo, Thomas A. Desautels, Yisong Yue
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments across diverse tasks demonstrate that PS-BAX performs competitively with existing baselines while being significantly faster, simpler to implement, and easily parallelizable, setting a strong baseline for future research. |
| Researcher Affiliation | Academia | Chu Xin Cheng California Institute of Technology ccheng2@caltech.edu Raul Astudillo California Institute of Technology rastudil@caltech.edu Thomas Desautels Lawrence Livermore National Laboratory desautels2@llnl.gov Yisong Yue California Institute of Technology yyue@caltech.edu |
| Pseudocode | Yes | Algorithm 1 PS-BAX (a hedged sketch of the posterior-sampling loop is given after the table) |
| Open Source Code | Yes | Code to reproduce our experiments is available at https://github.com/RaulAstudillo06/PSBAX. |
| Open Datasets | Yes | We evaluate the algorithms on a synthetic problem (the 2-dimensional Himmelblau function) and a real-world topographic dataset, consisting of 87 × 61 height measurements from a large geographic area around Auckland’s Maunga Whau volcano [44]... The first problem uses 3-dimensional Rosenbrock function, a standard benchmark in the optimization literature... The second problem is a real-world top-k (k = 10) selection task in protein design... Following [29], we use the tau protein assay [48] and interferon-gamma assay [49] datasets from the Achilles project [50]. |
| Dataset Splits | No | The paper mentions that "an initial dataset is generated by sampling 2(d + 1) inputs uniformly at random from X" and "Each experiment was replicated 30 times", but it does not specify explicit training, validation, and test dataset splits like percentages or absolute sample counts for data partitioning. |
| Hardware Specification | No | The paper reports average runtimes per iteration in Table 1 but explicitly states in the NeurIPS checklist: "Unfortunately, the exact details of the computing resources used are not available." No specific GPU or CPU models are mentioned. |
| Software Dependencies | No | The paper mentions: "All our algorithms are implemented using BoTorch [33]. Specifically, we use BoTorch's SingleTaskGP class..." However, it does not provide specific version numbers for BoTorch or any other software dependencies. |
| Experiment Setup | Yes | In all experiments, an initial dataset is generated by sampling 2(d + 1) inputs uniformly at random from X... Unless stated otherwise, the batch size is set to q = 1... Each experiment was replicated 30 times, with plots showing mean performance plus and minus 1.96 standard errors... For Ackley, we set the batch size to q = 2... As a performance metric, we report the log10 inference regret... The threshold τ is set to the 0.55 quantile of all function values in the domain... The performance metric used is the F1 score... we use a deep kernel GP [47] as our probabilistic model... we perform batched evaluations with batch size of q = 4... Approximate samples from the posterior on f for both PS-BAX and INFO-BAX are generated using 1000 random Fourier features [52]. For INFO-BAX, we use L = 30 Monte Carlo samples to estimate the EIG across all problems. (An illustrative BoTorch setup reflecting this configuration is sketched after the table.) |
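
The following is a minimal, hypothetical sketch of the posterior-sampling loop that Algorithm 1 (PS-BAX) describes: draw an approximate sample from the GP posterior, run the base algorithm (e.g., top-k selection or level-set estimation) on that sample, and query points from its output. The function names `gp_posterior_sample`, `base_algorithm`, and `ps_bax_step`, and the specific rule for picking among unevaluated outputs, are illustrative assumptions, not taken from the paper's codebase.

```python
# Hypothetical sketch of one PS-BAX iteration (not the authors' implementation).
import numpy as np


def ps_bax_step(train_x, train_y, gp_posterior_sample, base_algorithm, rng, q=1):
    """Select the next q query point(s) via posterior sampling.

    gp_posterior_sample: callable fitted on (train_x, train_y) that returns an
        approximate draw f_tilde: X -> R from the GP posterior.
    base_algorithm: callable that runs the target algorithm on f_tilde and
        returns the set of inputs defining its output (assumed interface).
    """
    f_tilde = gp_posterior_sample(train_x, train_y)   # 1. sample f_tilde from the posterior
    candidate_set = base_algorithm(f_tilde)           # 2. run the base algorithm on the sample
    # 3. prefer output points that have not been evaluated yet (illustrative rule)
    new_points = [x for x in candidate_set
                  if not any(np.allclose(x, xi) for xi in train_x)]
    if not new_points:
        # fall back to a random element of the algorithm's output
        new_points = [candidate_set[rng.integers(len(candidate_set))]]
    return new_points[:q]
```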
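
As a companion to the Software Dependencies and Experiment Setup rows, here is an illustrative BoTorch snippet consistent with the reported configuration: a SingleTaskGP surrogate, an initial design of 2(d + 1) uniform points, and approximate posterior sample paths built from roughly 1000 random Fourier features. The calls shown (`SingleTaskGP`, `fit_gpytorch_mll`, `get_gp_samples`) are standard BoTorch/GPyTorch APIs, but since the paper does not pin library versions, exact import paths may differ across releases; the objective values here are placeholders.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.utils.gp_sampling import get_gp_samples
from gpytorch.mlls import ExactMarginalLogLikelihood

d = 3                                  # e.g., the 3-dimensional Rosenbrock problem
n_init = 2 * (d + 1)                   # initial design size reported in the paper
train_x = torch.rand(n_init, d, dtype=torch.double)    # uniform samples in [0, 1]^d
train_y = torch.randn(n_init, 1, dtype=torch.double)   # placeholder objective values

# Fit the GP surrogate.
model = SingleTaskGP(train_x, train_y)
mll = ExactMarginalLogLikelihood(model.likelihood, model)
fit_gpytorch_mll(mll)

# Draw one approximate posterior sample path via random Fourier features.
f_tilde = get_gp_samples(model=model, num_outputs=1, n_samples=1,
                         num_rff_features=1000)
candidates = torch.rand(5, d, dtype=torch.double)
values_on_sample = f_tilde.posterior(candidates).mean   # evaluate the sampled path
```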