Single Model Uncertainty Estimation via Stochastic Data Centering
Authors: Jayaraman J. Thiagarajan, Rushil Anirudh, Vivek Sivaraman Narayanaswamy, Peer-Timo Bremer
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our approach in this section using a variety of applications and benchmarks: (a) first, we consider the utility of epistemic uncertainties in object recognition problems, where they have been successfully used for outlier rejection and calibrating models under distribution shifts. We show that Δ-UQ can be very effective even with large-scale datasets like ImageNet [22]; (b) next, we consider the challenging problem of sequential design optimization of black-box functions, where the goal is to maximize a scalar function of interest with the fewest number of sample evaluations. Using a Bayesian optimization setup, we show that the uncertainties obtained using Δ-UQ outperform many competitive methods across an extensive suite of black-box functions. |
| Researcher Affiliation | Academia | Jayaraman J. Thiagarajan Lawrence Livermore National Laboratory jjayaram@llnl.gov; Rushil Anirudh Lawrence Livermore National Laboratory anirudh1@llnl.gov; Vivek Narayanaswamy Arizona State University vnaray29@asu.edu; Peer-Timo Bremer Lawrence Livermore National Laboratory bremer5@llnl.gov |
| Pseudocode | Yes | Figure 2: Mini-batch training with Δ-UQ (see the runnable sketch below the table). for inputs, targets in trainloader: A = shuffle(inputs) # anchors; D = inputs - A # delta; X_d = torch.cat([A, D], axis=1); y_d = model(X_d) # prediction; loss = criterion(y_d, targets) |
| Open Source Code | Yes | Code for Δ-UQ can be accessed at github.com/LLNL/DeltaUQ |
| Open Datasets | Yes | We show that Δ-UQ can be very effective even with large-scale datasets like ImageNet [22]; We also perform an experiment with a pre-trained generative model (GAN) trained on MNIST handwritten digits |
| Dataset Splits | Yes | Let us examine the scenario where we shift an entire dataset (both train and validation) using a constant bias, c, to obtain a new dataset Dc; We train a modified ResNet-50 [23] model on ImageNet that accepts 6 input channels (anchor, Δ) as outlined earlier. We train the model using standard hyperparameter settings, except training it longer, for 120 epochs; We use the clean ImageNet validation data as inliers. |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory amounts used for the experiments. |
| Software Dependencies | No | The paper mentions a "PyTorch snippet" and the "Adam optimizer" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We train the model using standard hyperparameter settings, except training it longer, for 120 epochs; All methods were trained with the same set of hyperparameters: Adam optimizer with learning rate 1e-4 and 500 epochs, except for BNN, which required 1000 epochs for convergence. |
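
For concreteness, below is a minimal runnable sketch of the Δ-UQ training step from Figure 2 together with anchor-marginalized inference. It assumes a toy 1-D regression model; the architecture, learning rate, and the choice of K = 10 anchors are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

# Toy 1-D regression model; the first layer takes the concatenated
# [anchor, Δ] channels (hence 2 inputs for 1-D data). Sizes are
# illustrative, not the paper's configuration.
model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()

def train_step(inputs, targets):
    # Draw one random anchor per sample by shuffling the batch.
    A = inputs[torch.randperm(inputs.size(0))]
    D = inputs - A                          # residual Δ = x - c
    y_d = model(torch.cat([A, D], dim=1))   # predict from [anchor, Δ]
    loss = criterion(y_d, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

@torch.no_grad()
def predict_with_uncertainty(inputs, anchors, K=10):
    # Marginalize over K random anchors: the mean across anchors is
    # the prediction, their std is the epistemic uncertainty estimate.
    preds = []
    for _ in range(K):
        idx = torch.randint(anchors.size(0), (inputs.size(0),))
        A = anchors[idx]
        preds.append(model(torch.cat([A, inputs - A], dim=1)))
    preds = torch.stack(preds)
    return preds.mean(dim=0), preds.std(dim=0)
```

At test time, averaging predictions across anchors gives the point estimate, while their standard deviation serves as the epistemic uncertainty, following the anchor-marginalization idea the paper describes.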