Sparse Uncertainty Representation in Deep Learning with Inducing Weights
Authors: Hippolyt Ritter, Martin Kukla, Cheng Zhang, Yingzhen Li
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the inducing weight approaches on regression, classification and related uncertainty estimation tasks. The goal is to demonstrate competitive performance to popular W-space uncertainty estimation methods while using significantly fewer parameters. Experiments in classification, model robustness and out-of-distribution detection tasks show that our inducing weight approaches achieve competitive performance to their counterparts in the original weight space on modern deep architectures for image classification, while reducing the parameter count to 24.3% of that of a single network. |
| Researcher Affiliation | Collaboration | Hippolyt Ritter (1), Martin Kukla (2), Cheng Zhang (2) & Yingzhen Li (3) — (1) University College London, (2) Microsoft Research Cambridge, UK, (3) Imperial College London |
| Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. It includes code snippets for a PyTorch wrapper, but these are not algorithmic pseudocode for the described methodology. |
| Open Source Code | Yes | We open-source our proposed inducing weight approach, together with baseline methods reported in the experiments, as a PyTorch (Paszke et al., 2019) wrapper named bayesianize: https://github.com/microsoft/bayesianize. |
| Open Datasets | Yes | As the core empirical evaluation, we train ResNet-50 models (He et al., 2016b) on CIFAR-10 and CIFAR-100 (Krizhevsky et al., 2009). To investigate the models' robustness to distribution shift, we compute predictions on corrupted CIFAR datasets (Hendrycks & Dietterich, 2019) after training on clean data. |
| Dataset Splits | No | The paper mentions training on CIFAR-10 and CIFAR-100 and reports test accuracy and ECE, but it does not explicitly provide the specific percentages or counts for training, validation, and test dataset splits. |
| Hardware Specification | Yes | In Fig. 3 we show prediction run-times for batch-size = 500 on an NVIDIA Tesla V100 GPU |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' but does not provide specific version numbers for PyTorch or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | In convolution layers, we treat the 4D weight tensor W of shape (c_out, c_in, h, w) as a c_out × (c_in · h · w) matrix. We use U matrices of shape 64 × 64 for all layers (i.e. M = M_in = M_out = 64), except that for CIFAR-10 we set M_out = 10 for the last layer. In Fig. 3 we show prediction run-times for batch size = 500 on an NVIDIA Tesla V100 GPU. Hyper-parameter choices: we visualise in Fig. 4 the accuracy and ECE results for computationally lighter inducing weight ResNet-18 models with different hyper-parameters (see Appendix J). Setting proper values for λ_max and σ_max is also key to the improved results. |
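
The experiment-setup row above describes reshaping a 4D convolution weight tensor into a 2D matrix and pairing it with a much smaller 64 × 64 inducing matrix. The minimal PyTorch sketch below illustrates only that shape bookkeeping and the resulting parameter saving; the tensor sizes chosen here (a 256 × 128 × 3 × 3 conv layer) and the variable names are illustrative assumptions, not values from the paper, and this is not the bayesianize implementation itself.

```python
import torch

# Illustrative conv-layer dimensions (not taken from the paper).
c_out, c_in, h, w = 256, 128, 3, 3
W = torch.randn(c_out, c_in, h, w)

# Treat the 4D weight tensor as a (c_out, c_in * h * w) matrix,
# as described in the paper's experiment setup.
W_mat = W.view(c_out, c_in * h * w)

# Inducing weight matrix U with M = M_in = M_out = 64.
M_in = M_out = 64
U = torch.randn(M_out, M_in)

print(W_mat.shape, U.shape)
# The inducing matrix carries far fewer parameters than the full
# weight matrix, which is the source of the reported memory savings.
print(U.numel(), "vs", W_mat.numel())
```

For this layer the inducing matrix holds 4,096 parameters against 294,912 in the full weight matrix, which is the per-layer mechanism behind the paper's overall parameter reduction.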