Active Learning for Derivative-Based Global Sensitivity Analysis with Gaussian Processes

Authors: Syrine Belakaria, Ben Letham, Jana Doppa, Barbara Engelhardt, Stefano Ermon, Eytan Bakshy

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through comprehensive evaluation on synthetic and real-world problems, our study demonstrates how these active learning acquisition strategies substantially enhance the sample efficiency of DGSM estimation, particularly with limited evaluation budgets.
Researcher Affiliation | Collaboration | Syrine Belakaria (Stanford University), Benjamin Letham (Meta), Janardhan Rao Doppa (Washington State University), Barbara Engelhardt (Stanford University), Stefano Ermon (Stanford University), Eytan Bakshy (Meta)
Pseudocode | Yes | Algorithm 1 Bayesian Active Learning. Input: X, f(x), surrogate model GP, utility function α(x, GP), total budget T. Output: D_T, GP. (A runnable sketch of this loop is given after the table.)
Open Source Code | Yes | The implementation for our methods, the baselines, and the synthetic and real-world problems is available in our code (https://github.com/belakaria/AL-GSA-DGSMs).
Open Datasets | Yes | For synthetic experiments we used a family of functions designed specifically for evaluating sensitivity analysis measures [18, 17]: Ishigami1 (d = 3), Ishigami2 (d = 3), Gsobol6 (d = 6), a-function (d = 6), Gsobol10 (d = 10), Gsobol15 (d = 15), and Morris (d = 20). Ground-truth DGSMs are available for these problems. We additionally used other general-purpose synthetic functions where sensitivity might be challenging to estimate [8]: Branin (d = 2), Hartmann3 (d = 3), and Hartmann4 (d = 4). For these functions, we numerically estimated ground-truth DGSMs. We considered three real-world design problems. The Car Side Impact Weight problem... We also used the Vehicle Safety problem... (A sketch of numerical ground-truth DGSM estimation is given after the table.)
Dataset Splits | No | We study settings with limited evaluation budgets. Quasirandom sequences are known to perform well given enough data [6, 3]. Here, we focus on the restrictive case where we initialize our experiments using five random inputs and run 30 iterations of active learning. Our results are averaged over 50 replicates from different initial points, and we report the mean and two standard errors over replicates.
Hardware Specification | No | For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [No]. Justification: We ran our experiments on different types of machines for the sake of parallelization. However, we provided a detailed time complexity discussion in Appendix C and wall-clock running times in Appendix E.2.
Software Dependencies | No | All acquisition functions were implemented in BoTorch [2] and were designed to be auto-differentiable and efficiently optimized with gradient optimization. Does the paper provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment? Answer: [No]. Justification: The paper mentions BoTorch but not specific versions of Python, PyTorch, CUDA, etc.; only BoTorch is named. (The BoTorch acquisition pattern is sketched after the table.)
Experiment Setup | Yes | We study settings with limited evaluation budgets. Quasirandom sequences are known to perform well given enough data [6, 3]. Here, we focus on the restrictive case where we initialize our experiments using five random inputs and run 30 iterations of active learning. Our results are averaged over 50 replicates from different initial points, and we report the mean and two standard errors over replicates. Our primary evaluation metric is root mean squared error (RMSE) of the DGSM estimate versus ground truth. (The metric computation is sketched after the table.)
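
Algorithm 1 follows the standard Bayesian active learning template: fit a GP to the data so far, maximize a utility α(x, GP) to pick the next input, evaluate f there, and repeat until the budget T is spent. Below is a minimal runnable sketch of that loop in BoTorch, matching the paper's setup of five random initial inputs and 30 iterations. The toy objective and the use of UpperConfidenceBound in place of the paper's DGSM-targeted utilities are assumptions for illustration only.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.optim import optimize_acqf
from botorch.acquisition import UpperConfidenceBound
from gpytorch.mlls import ExactMarginalLogLikelihood

def f(x):
    # Toy stand-in for the black-box function f(x) in Algorithm 1.
    return torch.sin(6 * x).sum(dim=-1, keepdim=True)

bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)  # design space X
train_x = torch.rand(5, 2, dtype=torch.double)  # five random initial inputs
train_y = f(train_x)

for t in range(30):  # total budget T = 30 active learning iterations
    gp = SingleTaskGP(train_x, train_y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    # Stand-in utility alpha(x, GP); the paper's DGSM-targeted
    # acquisitions would be dropped in here.
    alpha = UpperConfidenceBound(gp, beta=4.0)
    x_next, _ = optimize_acqf(alpha, bounds=bounds, q=1,
                              num_restarts=5, raw_samples=64)
    train_x = torch.cat([train_x, x_next])
    train_y = torch.cat([train_y, f(x_next)])
# D_T is (train_x, train_y); the final GP is the last model fit on D_T.
```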
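On the datasets row: the DGSM of input i is ν_i = E[(∂f/∂x_i)²] over the input distribution, and for functions without closed-form DGSMs a Monte Carlo estimate with automatic differentiation suffices. The sketch below shows that numerical route on the standard Ishigami function (a = 7, b = 0.1, inputs uniform on [−π, π]); the sample size is an arbitrary choice, not taken from the paper.

```python
import math
import torch

def ishigami(x, a=7.0, b=0.1):
    # Standard Ishigami test function on [-pi, pi]^3.
    return (torch.sin(x[:, 0]) + a * torch.sin(x[:, 1]) ** 2
            + b * x[:, 2] ** 4 * torch.sin(x[:, 0]))

n = 100_000
x = (2 * math.pi * torch.rand(n, 3) - math.pi).requires_grad_(True)
y = ishigami(x)
# Rows are independent, so the gradient of the sum recovers per-sample gradients.
(grad,) = torch.autograd.grad(y.sum(), x)
dgsm = (grad ** 2).mean(dim=0)  # nu_i = E[(df/dx_i)^2], one per input dimension
print(dgsm)
```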
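On the software row: BoTorch acquisition functions are ordinary torch modules, so any utility written in torch operations is auto-differentiable and can be maximized with the gradient-based optimize_acqf, which is what "designed to be auto-differentiable" refers to. A hypothetical example of the pattern, using pointwise posterior variance as the utility (not one of the paper's acquisitions):

```python
import torch
from botorch.acquisition import AcquisitionFunction
from botorch.utils.transforms import t_batch_mode_transform

class PosteriorVariance(AcquisitionFunction):
    """Toy utility: GP posterior variance at a point. Because it is built
    from torch ops on the model posterior, gradients flow through it."""

    @t_batch_mode_transform(expected_q=1)
    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X has shape batch x 1 x d after the transform.
        posterior = self.model.posterior(X)
        return posterior.variance.squeeze(-1).squeeze(-1)  # shape: batch
```

An instance of this class can be passed to optimize_acqf exactly like a built-in acquisition.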
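Finally, the reported metric. One plausible reading, assumed here along with the array shapes, is RMSE across the d per-input DGSM estimates within each replicate, then the mean and two standard errors across the 50 replicates:

```python
import numpy as np

def rmse_summary(estimates, truth):
    """estimates: (n_replicates, d) DGSM estimates; truth: (d,) ground truth.
    Returns mean RMSE over replicates and two standard errors."""
    rmse = np.sqrt(np.mean((estimates - truth) ** 2, axis=1))
    return rmse.mean(), 2 * rmse.std(ddof=1) / np.sqrt(len(rmse))

# Placeholder data: 50 replicates on a d = 3 problem (values are not real results).
rng = np.random.default_rng(0)
truth = np.array([1.0, 2.0, 0.5])  # hypothetical ground-truth DGSMs
estimates = truth + 0.1 * rng.standard_normal((50, 3))
mean_rmse, two_se = rmse_summary(estimates, truth)
print(f"RMSE = {mean_rmse:.3f} +/- {two_se:.3f}")
```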