reproducibilityindex.ai

Transitional Uncertainty with Layered Intermediate Predictions

Authors: Ryan Benkert, Mohit Prabhushankar, Ghassan Alregib

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show that TULIP matches or outperforms current single-pass methods on standard benchmarks and in practical settings where these methods are less reliable (imbalances, complex architectures, medical modalities).
Researcher Affiliation	Academia	School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA.
Pseudocode	Yes	In addition to our description in the main paper, we provide implementation details and algorithm pseudo code in Appendix D.
Open Source Code	No	When an implementation was publicly available, we heavily relied on it in our own code. This is the case for DUQ (https://github.com/y0ast/deterministic-uncertaintyquantification), and SNGP (https://github.com/google/uncertainty-baselines/blob/master/baselines/imagenet/sngp.py, as well as https://github.com/y0ast/DUE).
Open Datasets	Yes	The following combinations are evaluated: CIFAR10 vs. CIFAR10-C/CIFAR100-C/SVHN and CIFAR100 vs. CIFAR10-C/CIFAR100-C/SVHN (Krizhevsky et al., 2009; Netzer et al., 2011; Hendrycks & Dietterich, 2019).
Dataset Splits	Yes	During training, the shallow-deep network exits are trained jointly with the feed-forward component, while the combination head is fitted after optimization on a validation set extracted from the training data X_ID.
Hardware Specification	Yes	For all of our experiments we use a single NVIDIA GeForce GTX 1080 Ti.
Software Dependencies	No	All experiments are implemented with PyTorch.
Experiment Setup	Yes	In all experiments, we train a ResNet-18 architecture (He et al., 2016) for 200 epochs and optimize with stochastic gradient descent with a learning rate of 0.01. We further decrease the learning rate by a factor of 0.2 at epochs 100, 125, 150, and 175, and use the data augmentations random crop, random horizontal flip, and cutout to improve generalization performance.
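
The learning-rate schedule quoted in the Experiment Setup row can be sketched in plain Python as follows. This is an illustration rather than the authors' code: the function name lr_at_epoch is hypothetical, "decrease by a factor of 0.2" is read as multiplying the rate by 0.2, and epochs are assumed to be 0-indexed with each decay taking effect at the start of its milestone epoch. The source states none of these details.

```python
import math

BASE_LR = 0.01                      # initial learning rate stated in the setup
DECAY_FACTOR = 0.2                  # assumed to be a multiplicative factor
MILESTONES = (100, 125, 150, 175)   # epochs at which the rate is decreased
TOTAL_EPOCHS = 200                  # training length stated in the setup


def lr_at_epoch(epoch: int) -> float:
    """Return the learning rate in effect during a 0-indexed epoch.

    Assumption: the decay applies from the milestone epoch onward,
    so epoch 99 still uses the base rate and epoch 100 uses the decayed one.
    """
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in [0, {TOTAL_EPOCHS}), got {epoch}")
    # Count how many milestones have been reached by this epoch.
    num_decays = sum(1 for milestone in MILESTONES if epoch >= milestone)
    return BASE_LR * DECAY_FACTOR ** num_decays


if __name__ == "__main__":
    # Print the rate at the first epoch and at each milestone.
    for epoch in (0, *MILESTONES):
        print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.2e}")
```

Under the same assumptions, PyTorch's torch.optim.lr_scheduler.MultiStepLR with milestones=[100, 125, 150, 175] and gamma=0.2 should yield this schedule, though the source does not say which scheduler was used.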