Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Uncertainty Quantification via Stable Distribution Propagation
Authors: Felix Petersen, Aashwin Ananda Mishra, Hilde Kuehne, Christian Borgelt, Oliver Deussen, Mikhail Yurochkin
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To empirically validate SDP, we (i) compare it to other distribution propagation approaches in a variety of settings covering total variation (TV) distance and Wasserstein distance; (ii) compare it to other uncertainty quantification methods on 8 UCI [25] regression tasks; and (iii) demonstrate the utility of Cauchy distribution propagation in selective prediction on MNIST [26] and EMNIST [27]. |
| Researcher Affiliation | Collaboration | Felix Petersen¹, Aashwin Mishra¹, Hilde Kuehne²,³, Christian Borgelt⁴, Oliver Deussen⁵, Mikhail Yurochkin³; ¹Stanford University, ²University of Bonn, ³MIT-IBM Watson AI Lab, ⁴University of Salzburg, ⁵University of Konstanz |
| Pseudocode | Yes | We provide pseudo-code and PyTorch implementations of SDP in SM D. |
| Open Source Code | Yes | The code is publicly available at github.com/Felix-Petersen/distprop. |
| Open Datasets | Yes | 8 UCI [25] regression tasks, selective prediction on MNIST [26] and EMNIST [27], CIFAR-10 ResNet-18 [46] model. |
| Dataset Splits | Yes | In Tab. 4, following [9], we report the test PICP and MPIW of those models where the validation PICP lies between 92.5% and 97.5% using the evaluation code provided by Tagasovska et al. [9]. |
| Hardware Specification | Yes | Times per epoch on CIFAR-10 with a batch size of 128 on a single V100 GPU. |
| Software Dependencies | Yes | tested with PyTorch version 1.13.1 |
| Experiment Setup | Yes | That is, we use a network with 1 ReLU-activated hidden layer, with 64 hidden neurons and train it for 5000 epochs. We perform this for 20 seeds and for a learning rate $\eta \in \{10^{-2}, 10^{-3}, 10^{-4}\}$ and weight decay $\in \{0, 10^{-3}, 10^{-2}, 10^{-1}, 1\}$. For the input standard deviation, we made a single initial run with input variance $\sigma^2 \in \{10^{-8}, 10^{-7}, 10^{-6}, 10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 10^{0}\}$ and then (for each data set) used 11 variances at a resolution of $10^{0.1}$ around the best initial variance. |
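
The hyperparameter search quoted in the Experiment Setup row can be sketched as a small grid enumeration. This is an illustrative sketch, not the authors' code: the names `SEEDS`, `LEARNING_RATES`, `WEIGHT_DECAYS`, `INITIAL_VARIANCES`, `refined_variances`, and `training_grid` are hypothetical, and placing the 11 refined variances symmetrically around the best initial variance is an assumption (the quoted text only says "around").

```python
import itertools

# Values taken from the quoted Experiment Setup text.
SEEDS = range(20)
LEARNING_RATES = [1e-2, 1e-3, 1e-4]
WEIGHT_DECAYS = [0.0, 1e-3, 1e-2, 1e-1, 1.0]
# Coarse sweep over input variances: 10^-8, 10^-7, ..., 10^0 (9 values).
INITIAL_VARIANCES = [10.0 ** e for e in range(-8, 1)]


def refined_variances(best_variance, n=11, step_exponent=0.1):
    """Return n variances spaced by a factor of 10**step_exponent.

    Assumption: the values are centered on best_variance.
    """
    half = n // 2
    return [best_variance * 10.0 ** (step_exponent * k)
            for k in range(-half, half + 1)]


def training_grid():
    """Enumerate every (seed, learning rate, weight decay) combination."""
    return list(itertools.product(SEEDS, LEARNING_RATES, WEIGHT_DECAYS))


if __name__ == "__main__":
    print(len(training_grid()), "training configurations")
    print(refined_variances(1e-4))
```

Under these assumptions the grid covers 20 × 3 × 5 training configurations, and the refinement step spans one order of magnitude (half a decade on either side of the best coarse variance).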