Active Learning for Derivative-Based Global Sensitivity Analysis with Gaussian Processes

Authors: Syrine Belakaria, Ben Letham, Jana Doppa, Barbara Engelhardt, Stefano Ermon, Eytan Bakshy

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through comprehensive evaluation on synthetic and real-world problems, our study demonstrates how these active learning acquisition strategies substantially enhance the sample efficiency of DGSM estimation, particularly with limited evaluation budgets.
Researcher Affiliation | Collaboration | Syrine Belakaria (Stanford University), Benjamin Letham (Meta), Janardhan Rao Doppa (Washington State University), Barbara Engelhardt (Stanford University), Stefano Ermon (Stanford University), Eytan Bakshy (Meta)
Pseudocode | Yes | Algorithm 1 Bayesian Active Learning. Input: X, f(x), surrogate model GP, utility function α(x, GP), total budget T. Output: D_T, GP. (A runnable sketch of this loop is given after the table.)
Open Source Code | Yes | The implementation for our methods, the baselines, and the synthetic and real-world problems is available in our code (https://github.com/belakaria/AL-GSA-DGSMs).
Open Datasets | Yes | For synthetic experiments we used a family of functions designed specifically for evaluating sensitivity analysis measures [18, 17]: Ishigami1 (d = 3), Ishigami2 (d = 3), Gsobol6 (d = 6), a-function (d = 6), Gsobol10 (d = 10), Gsobol15 (d = 15), and Morris (d = 20). Ground-truth DGSMs are available for these problems. We additionally used other general-purpose synthetic functions where sensitivity might be challenging to estimate [8]: Branin (d = 2), Hartmann3 (d = 3), and Hartmann4 (d = 4). For these functions, we numerically estimated ground-truth DGSMs. We considered three real-world design problems. The Car Side Impact Weight problem... We also used the Vehicle Safety problem... (A sketch of numerical ground-truth DGSM estimation is given after the table.)
Dataset Splits | No | We study settings with limited evaluation budgets. Quasirandom sequences are known to perform well given enough data [6, 3]. Here, we focus on the restrictive case where we initialize our experiments using five random inputs and run 30 iterations of active learning. Our results are averaged over 50 replicates from different initial points, and we report the mean and two standard errors over replicates.
Hardware Specification | No | For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [No]. Justification: We ran our experiments on different types of machines for the sake of parallelization. However, we provided a detailed time complexity discussion in Appendix C and wall-clock running times in Appendix E.2.
Software Dependencies | No | All acquisition functions were implemented in BoTorch [2] and were designed to be auto-differentiable and efficiently optimized with gradient optimization. Does the paper provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment? Answer: [No]. Justification: The paper mentions BoTorch but not specific versions of Python, PyTorch, CUDA, etc.; only BoTorch is named. (The BoTorch acquisition pattern is sketched after the table.)
Experiment Setup | Yes | We study settings with limited evaluation budgets. Quasirandom sequences are known to perform well given enough data [6, 3]. Here, we focus on the restrictive case where we initialize our experiments using five random inputs and run 30 iterations of active learning. Our results are averaged over 50 replicates from different initial points, and we report the mean and two standard errors over replicates. Our primary evaluation metric is root mean squared error (RMSE) of the DGSM estimate versus ground truth. (The metric computation is sketched after the table.)
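
Algorithm 1 follows the standard Bayesian active learning template: fit a GP to the data so far, maximize a utility α(x, GP) to pick the next input, evaluate f there, and repeat until the budget T is spent. Below is a minimal runnable sketch of that loop in BoTorch, matching the paper's setup of five random initial inputs and 30 iterations. The toy objective and the use of UpperConfidenceBound in place of the paper's DGSM-targeted utilities are assumptions for illustration only.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.optim import optimize_acqf
from botorch.acquisition import UpperConfidenceBound
from gpytorch.mlls import ExactMarginalLogLikelihood

def f(x):
    # Toy stand-in for the black-box function f(x) in Algorithm 1.
    return torch.sin(6 * x).sum(dim=-1, keepdim=True)

bounds = torch.tensor([[0.0, 0.0], [1.0, 1.0]], dtype=torch.double)  # design space X
train_x = torch.rand(5, 2, dtype=torch.double)  # five random initial inputs
train_y = f(train_x)

for t in range(30):  # total budget T = 30 active learning iterations
    gp = SingleTaskGP(train_x, train_y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))
    # Stand-in utility alpha(x, GP); the paper's DGSM-targeted
    # acquisitions would be dropped in here.
    alpha = UpperConfidenceBound(gp, beta=4.0)
    x_next, _ = optimize_acqf(alpha, bounds=bounds, q=1,
                              num_restarts=5, raw_samples=64)
    train_x = torch.cat([train_x, x_next])
    train_y = torch.cat([train_y, f(x_next)])
# D_T is (train_x, train_y); the final GP is the last model fit on D_T.
```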
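On the datasets row: the DGSM of input i is ν_i = E[(∂f/∂x_i)²] over the input distribution, and for functions without closed-form DGSMs a Monte Carlo estimate with automatic differentiation suffices. The sketch below shows that numerical route on the standard Ishigami function (a = 7, b = 0.1, inputs uniform on [−π, π]); the sample size is an arbitrary choice, not taken from the paper.

```python
import math
import torch

def ishigami(x, a=7.0, b=0.1):
    # Standard Ishigami test function on [-pi, pi]^3.
    return (torch.sin(x[:, 0]) + a * torch.sin(x[:, 1]) ** 2
            + b * x[:, 2] ** 4 * torch.sin(x[:, 0]))

n = 100_000
x = (2 * math.pi * torch.rand(n, 3) - math.pi).requires_grad_(True)
y = ishigami(x)
# Rows are independent, so the gradient of the sum recovers per-sample gradients.
(grad,) = torch.autograd.grad(y.sum(), x)
dgsm = (grad ** 2).mean(dim=0)  # nu_i = E[(df/dx_i)^2], one per input dimension
print(dgsm)
```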
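On the software row: BoTorch acquisition functions are ordinary torch modules, so any utility written in torch operations is auto-differentiable and can be maximized with the gradient-based optimize_acqf, which is what "designed to be auto-differentiable" refers to. A hypothetical example of the pattern, using pointwise posterior variance as the utility (not one of the paper's acquisitions):

```python
import torch
from botorch.acquisition import AcquisitionFunction
from botorch.utils.transforms import t_batch_mode_transform

class PosteriorVariance(AcquisitionFunction):
    """Toy utility: GP posterior variance at a point. Because it is built
    from torch ops on the model posterior, gradients flow through it."""

    @t_batch_mode_transform(expected_q=1)
    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X has shape batch x 1 x d after the transform.
        posterior = self.model.posterior(X)
        return posterior.variance.squeeze(-1).squeeze(-1)  # shape: batch
```

An instance of this class can be passed to optimize_acqf exactly like a built-in acquisition.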
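Finally, the reported metric. One plausible reading, assumed here along with the array shapes, is RMSE across the d per-input DGSM estimates within each replicate, then the mean and two standard errors across the 50 replicates:

```python
import numpy as np

def rmse_summary(estimates, truth):
    """estimates: (n_replicates, d) DGSM estimates; truth: (d,) ground truth.
    Returns mean RMSE over replicates and two standard errors."""
    rmse = np.sqrt(np.mean((estimates - truth) ** 2, axis=1))
    return rmse.mean(), 2 * rmse.std(ddof=1) / np.sqrt(len(rmse))

# Placeholder data: 50 replicates on a d = 3 problem (values are not real results).
rng = np.random.default_rng(0)
truth = np.array([1.0, 2.0, 0.5])  # hypothetical ground-truth DGSMs
estimates = truth + 0.1 * rng.standard_normal((50, 3))
mean_rmse, two_se = rmse_summary(estimates, truth)
print(f"RMSE = {mean_rmse:.3f} +/- {two_se:.3f}")
```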