Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Conservative Uncertainty Estimation By Fitting Prior Networks
Authors: Kamil Ciosek, Vincent Fortuin, Ryota Tomioka, Katja Hofmann, Richard Turner
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide experimental evaluation of random priors on calibration and out-of-distribution detection on typical computer vision tasks, demonstrating that they outperform deep ensembles in practice. |
| Researcher Affiliation | Collaboration | 1. Microsoft Research Cambridge; 2. ETH Zurich; 3. University of Cambridge. |
| Pseudocode | Yes | Algorithm 1 Training the predictors. function TRAIN-UNCERTAINTIES($X$): for $i = 1 \dots B$ do: $f_i \sim \{f(x)\}$ (random prior); $h_{X f_i} \leftarrow$ FIT($X, f_i(X)$); end for; return $f_i, h_{X f_i}$; end function. function FIT($X, f_i(X)$): $L(h) = \sum_{x \in X} \lVert f_i(x) - h(x) \rVert^2$; $h_{X f_i} \leftarrow$ OPTIMIZE($L$) (SGD or similar); return $h_{X f_i}$ (return trained predictor); end function. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code related to the described methodology. |
| Open Datasets | Yes | All methods were trained on four classes from the CIFAR-10 (Krizhevsky et al., 2009) dataset (training details are provided in Appendix A). |
| Dataset Splits | No | The paper mentions training and testing on datasets but does not explicitly state the use of a validation set or provide its split percentages/counts for reproducibility. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU, CPU models) used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'Adam optimizer (Kingma & Ba, 2014)' and the 'roc_auc_score function from the Python package sklearn (Pedregosa et al., 2011)' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We optimized the initialization scale of our networks as a hyperparameter on the grid {0.01, 0.1, 1.0, 2.0, 10.0} and chose 2.0. We chose a scaling factor of β = 1.0 for the uncertainty bonus of the random priors and fixed it for all experiments. |
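
The structure of Algorithm 1, quoted in the Pseudocode row above, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the 1-D `tanh` network, the helper names (`make_random_net`, `fit`, `train_uncertainties`, `uncertainty`), the fixed hidden layer of the predictor, and the use of full-batch gradient descent in place of "SGD or similar" are all hypothetical choices made to keep the example short. The uncertainty readout here is simply the average squared prior-predictor error; it omits the β-scaled bonus mentioned in the Experiment Setup row.

```python
import math
import random


def make_random_net(rng, width=16, init_scale=1.0):
    """Sample a tiny 1-D network: x -> sum_k a_k * tanh(w_k * x + b_k).

    Architecture and init_scale are illustrative assumptions.
    """
    return {
        "w": [rng.gauss(0.0, init_scale) for _ in range(width)],
        "b": [rng.gauss(0.0, init_scale) for _ in range(width)],
        "a": [rng.gauss(0.0, 1.0) for _ in range(width)],
    }


def features(net, x):
    """Hidden-layer activations of the network at input x."""
    return [math.tanh(w * x + b) for w, b in zip(net["w"], net["b"])]


def forward(net, x):
    """Scalar output of the network at input x."""
    return sum(a * phi for a, phi in zip(net["a"], features(net, x)))


def mean_sq_error(prior, predictor, xs):
    """Average squared gap between prior and predictor over the inputs xs."""
    return sum((forward(prior, x) - forward(predictor, x)) ** 2 for x in xs) / len(xs)


def fit(xs, prior, rng, steps=200, lr=0.02):
    """FIT: train a predictor h to match the prior f on the training inputs.

    Assumption: only the output weights are trained, with full-batch
    gradient descent on the mean squared error.
    """
    predictor = make_random_net(rng)
    targets = [forward(prior, x) for x in xs]
    phis = [features(predictor, x) for x in xs]  # hidden layer stays fixed
    for _ in range(steps):
        grad = [0.0] * len(predictor["a"])
        for phi, target in zip(phis, targets):
            err = sum(a * p for a, p in zip(predictor["a"], phi)) - target
            for k, p in enumerate(phi):
                grad[k] += 2.0 * err * p / len(xs)
        predictor["a"] = [a - lr * g for a, g in zip(predictor["a"], grad)]
    return predictor


def train_uncertainties(xs, num_priors=3, seed=0):
    """TRAIN-UNCERTAINTIES: draw B random priors and fit one predictor to each."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_priors):
        prior = make_random_net(rng)  # f_i drawn at random, never trained
        pairs.append((prior, fit(xs, prior, rng)))
    return pairs


def uncertainty(pairs, x):
    """Average squared prior-predictor error at x (simplified readout)."""
    return sum((forward(f, x) - forward(h, x)) ** 2 for f, h in pairs) / len(pairs)
```

A call such as `pairs = train_uncertainties([i / 10.0 - 1.0 for i in range(21)])` followed by `uncertainty(pairs, 0.5)` returns a non-negative score for the query point. The priors are kept frozen and only the predictors are trained, which mirrors the split between the two functions in the quoted algorithm.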