Accurate Uncertainties for Deep Learning Using Calibrated Regression

Authors: Volodymyr Kuleshov, Nathan Fenner, Stefano Ermon

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our proposed algorithm on a range of Bayesian models, including Bayesian linear regression as well as feedforward and recurrent Bayesian neural networks. Our method consistently produces well-calibrated confidence estimates, which are in turn useful for several tasks in time series forecasting and model-based reinforcement learning.
Researcher Affiliation | Collaboration | ¹Stanford University, Stanford, California; ²Afresh Technologies, San Francisco, California.
Pseudocode | Yes | Algorithm 1 (Recalibration of Regression Models). Input: uncalibrated model H : X → (Y → [0, 1]) and calibration set S = {(x_t, y_t)}_{t=1}^T. Output: auxiliary recalibration model R : [0, 1] → [0, 1]. (A minimal Python sketch of this procedure follows the table.)
Open Source Code | No | The paper does not provide any links to, or explicit statements about, a public release of source code for the described methodology.
Open Datasets | Yes | Datasets. We use eight UCI datasets varying in size from 194 to 8192 examples; examples carry between 6 and 159 continuous features.
Dataset Splits | No | There is generally no standard train/test split, hence we randomly assign 25% of each dataset for testing, and use the rest for training. (See the split sketch below the table.)
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU/GPU models, cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions various models and techniques (e.g., Bayesian Ridge Regression, dropout, Concrete dropout, isotonic regression, GRU, DenseNet) but does not provide specific version numbers for any software libraries or frameworks used.
Experiment Setup | Yes | In UCI experiments, the feedforward neural network has two layers of 128 hidden units with a dropout rate of 0.5 and parametric ReLU non-linearities. Recurrent networks are based on a standard GRU architecture with two stacked layers and a recurrent dropout of 0.5 (Gal and Ghahramani, 2016b). (See the architecture sketch below the table.)
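
The pseudocode row above summarizes Algorithm 1 well enough to reconstruct its core step: evaluate the model's CDF at each observed calibration target, compute the matching empirical probabilities, and fit a monotone map between the two. Below is a minimal sketch, not the authors' released code (none is public); it assumes the uncalibrated model exposes per-example CDF values H(x_t)(y_t) as a NumPy array, and it uses scikit-learn's isotonic regression, the recalibration method the paper itself adopts.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def fit_recalibrator(cdf_values):
    """Fit the auxiliary model R : [0, 1] -> [0, 1] from Algorithm 1.

    cdf_values: H(x_t)(y_t) for each point in the calibration set,
    i.e. the predicted CDF evaluated at the observed target.
    """
    cdf_values = np.asarray(cdf_values, dtype=float)
    # Empirical frequency of each predicted probability level:
    # P_hat(p) = |{t : H(x_t)(y_t) <= p}| / T
    empirical = np.array([(cdf_values <= p).mean() for p in cdf_values])
    # Isotonic regression gives a monotone map from predicted
    # probabilities to empirical ones.
    recalibrator = IsotonicRegression(y_min=0.0, y_max=1.0,
                                      out_of_bounds="clip")
    recalibrator.fit(cdf_values, empirical)
    return recalibrator
```

At test time the calibrated CDF of a new input x is R(H(x)(·)), i.e. `recalibrator.predict` composed with the model's predicted CDF.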
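
For the dataset-splits row, the quoted protocol is an ordinary random hold-out. A one-line sketch assuming a scikit-learn workflow; the placeholder data and the random seed are illustrative, not from the paper:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder standing in for one of the eight UCI datasets
# (up to 8192 examples, up to 159 continuous features).
X, y = np.random.randn(8192, 159), np.random.randn(8192)

# "randomly assign 25% of each dataset for testing, and use the rest for training"
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)
```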
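
The experiment-setup row fully specifies the two UCI architectures, so they can be reconstructed directly. A PyTorch sketch under those settings; the input/output dimensions and the GRU hidden size are assumptions, since the quote does not state them:

```python
import torch.nn as nn

class FeedforwardUCI(nn.Module):
    """Two layers of 128 hidden units, dropout 0.5, parametric ReLU."""
    def __init__(self, n_features):  # n_features: 6-159 for the UCI datasets
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.PReLU(), nn.Dropout(0.5),
            nn.Linear(128, 128), nn.PReLU(), nn.Dropout(0.5),
            nn.Linear(128, 1),  # scalar regression output
        )

    def forward(self, x):
        return self.net(x)

# Two stacked GRU layers with dropout 0.5 between them; hidden size 128 is
# an assumed value. Note: nn.GRU's `dropout` applies between layers, whereas
# the variational recurrent dropout of Gal & Ghahramani (2016b) would need a
# custom cell and is not reproduced here.
recurrent = nn.GRU(input_size=1, hidden_size=128, num_layers=2,
                   dropout=0.5, batch_first=True)
```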