Transitional Uncertainty with Layered Intermediate Predictions

Authors: Ryan Benkert, Mohit Prabhushankar, Ghassan AlRegib

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show that TULIP matches or outperforms current single-pass methods on standard benchmarks and in practical settings where these methods are less reliable (imbalances, complex architectures, medical modalities)."
Researcher Affiliation | Academia | "School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA."
Pseudocode | Yes | "In addition to our description in the main paper, we provide implementation details and algorithm pseudo code in Appendix D."
Open Source Code | No | "When an implementation was publicly available, we heavily relied on it in our own code. This is the case for DUQ (https://github.com/y0ast/deterministic-uncertaintyquantification) and SNGP (https://github.com/google/uncertainty-baselines/blob/master/baselines/imagenet/sngp.py, as well as https://github.com/y0ast/DUE)."
Open Datasets | Yes | "The following combinations are evaluated: CIFAR10 vs. CIFAR10-C/CIFAR100-C/SVHN and CIFAR100 vs. CIFAR10-C/CIFAR100-C/SVHN (Krizhevsky et al., 2009; Netzer et al., 2011; Hendrycks & Dietterich, 2019)." (loading illustrated in the first sketch after the table)
Dataset Splits | Yes | "During training, the shallow-deep network exits are trained jointly with the feed-forward component, while the combination head is fitted after optimization on a validation set extracted from the training data X_ID." (split illustrated in the first sketch after the table)
Hardware Specification | Yes | "For all of our experiments we use a single NVIDIA GeForce GTX 1080 Ti."
Software Dependencies | No | "All experiments are implemented with PyTorch."
Experiment Setup | Yes | "In all experiments, we train a ResNet-18 architecture (He et al., 2016) over 200 epochs and optimize with stochastic gradient descent with a learning rate of 0.01. We further decrease the learning rate by a factor of 0.2 at epochs 100, 125, 150, and 175, and use the data augmentations random crop, random horizontal flip, and cutout to increase generalization performance." (configuration illustrated in the second sketch after the table)
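
The dataset and split rows can be made concrete with a minimal PyTorch sketch. Everything below is an illustrative assumption rather than the authors' code: the torchvision loaders, the 10% validation fraction, and the fixed seed are not taken from the paper, and the corrupted sets CIFAR-10-C/CIFAR-100-C ship as archives (Hendrycks & Dietterich, 2019) that need a custom loader not shown here.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.ToTensor()

# In-distribution (ID) training data; CIFAR-10 shown, CIFAR-100 is analogous.
train_id = datasets.CIFAR10("data/", train=True, download=True, transform=transform)

# One of the OOD test sets; CIFAR-10-C/CIFAR-100-C come as .npy archives and
# would need a custom Dataset, not shown here.
test_ood = datasets.SVHN("data/", split="test", download=True, transform=transform)

# Validation set extracted from the ID training data X_ID: the network exits
# train on the remainder, and the combination head is then fitted on the
# held-out split. The 10% fraction and the seed are illustrative assumptions.
val_size = int(0.1 * len(train_id))
train_subset, val_subset = random_split(
    train_id,
    [len(train_id) - val_size, val_size],
    generator=torch.Generator().manual_seed(0),
)
```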
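
The experiment-setup row maps onto a standard PyTorch training configuration. This is a sketch under stated assumptions, not the released implementation: the CIFAR-style crop padding of 4, the use of RandomErasing as a stand-in for cutout, and leaving momentum and weight decay unset (they are unstated in the quote) are all assumptions.

```python
import torch
from torch import nn, optim
from torchvision import transforms
from torchvision.models import resnet18

# Augmentations named in the setup; padding=4 is the usual CIFAR convention
# (assumed), and RandomErasing stands in for cutout.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.RandomErasing(),
])

# torchvision's ImageNet-shaped ResNet-18 stem is kept here, though CIFAR
# setups often swap it for a 3x3 stem (assumption either way).
model = resnet18(num_classes=10)  # 10 classes for CIFAR-10
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)  # momentum/weight decay unstated

# Multiply the learning rate by 0.2 at epochs 100, 125, 150, and 175.
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 125, 150, 175], gamma=0.2
)

for epoch in range(200):
    # ... one training pass with `criterion` over the augmented loader ...
    scheduler.step()
```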