Accurate Uncertainties for Deep Learning Using Calibrated Regression
Authors: Volodymyr Kuleshov, Nathan Fenner, Stefano Ermon
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed algorithm on a range of Bayesian models, including Bayesian linear regression as well as feedforward and recurrent Bayesian neural networks. Our method consistently produces well-calibrated confidence estimates, which are in turn useful for several tasks in time series forecasting and model-based reinforcement learning. |
| Researcher Affiliation | Collaboration | 1Stanford University, Stanford, California 2Afresh Technologies, San Francisco, California. |
| Pseudocode | Yes | Algorithm 1: Recalibration of Regression Models. Input: uncalibrated model $H : \mathcal{X} \to (\mathcal{Y} \to [0, 1])$ and calibration set $S = \{(x_t, y_t)\}_{t=1}^{T}$. Output: auxiliary recalibration model $R : [0, 1] \to [0, 1]$. (A code sketch of this procedure appears below the table.) |
| Open Source Code | No | The paper does not provide any specific links or explicit statements about the public release of the source code for the described methodology. |
| Open Datasets | Yes | Datasets. We use eight UCI datasets varying in size from 194 to 8192 examples; examples carry between 6 and 159 continuous features. |
| Dataset Splits | No | There is generally no standard train/test split, hence we randomly assign 25% of each dataset for testing, and use the rest for training. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU/GPU models, cloud instance types) used for running experiments. |
| Software Dependencies | No | The paper mentions various models and techniques (e.g., Bayesian Ridge Regression, dropout, Concrete dropout, isotonic regression, GRU, DenseNet) but does not provide specific version numbers for any software libraries or frameworks used. |
| Experiment Setup | Yes | In UCI experiments, the feedforward neural network has two layers of 128 hidden units with a dropout rate of 0.5 and parametric ReLU non-linearities. Recurrent networks are based on a standard GRU architecture with two stacked layers and a recurrent dropout of 0.5 (Gal and Ghahramani, 2016b). (An illustrative sketch of this configuration follows the table.) |
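
The recalibration procedure summarized in Algorithm 1 can be sketched briefly. The snippet below is a minimal illustration, not the authors' code: it assumes the forecaster is exposed through a hypothetical `predict_cdf(x, y)` helper returning the CDF value $[H(x)](y)$, and it fits $R$ with isotonic regression on the empirical frequencies of those CDF values over the calibration set, following the procedure the paper describes.

```python
# Minimal sketch of Algorithm 1 (recalibration of a regression model).
# `predict_cdf(x, y)` is a hypothetical helper returning [H(x)](y).
import numpy as np
from sklearn.isotonic import IsotonicRegression

def fit_recalibrator(predict_cdf, X_cal, y_cal):
    # Forecasted CDF values on the calibration set: p_t = [H(x_t)](y_t)
    p = np.array([predict_cdf(x, y) for x, y in zip(X_cal, y_cal)])
    # Empirical frequency of each level: P_hat(p_t) = |{s : p_s <= p_t}| / T
    p_hat = np.array([(p <= pt).mean() for pt in p])
    # R : [0, 1] -> [0, 1], fitted here with isotonic regression
    R = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
    R.fit(p, p_hat)
    return R  # recalibrated forecast for new (x, y): R.predict([predict_cdf(x, y)])
```

The composed forecaster $R \circ H$ is then approximately calibrated on data distributed like the calibration set.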
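The UCI feedforward architecture quoted under Experiment Setup can likewise be sketched. The paper does not state which framework was used, so the PyTorch code below is only one plausible reading of "two layers of 128 hidden units with a dropout rate of 0.5 and parametric ReLU non-linearities"; `n_features` is a placeholder for a dataset's input dimensionality.

```python
# Illustrative sketch (not the authors' code) of the UCI feedforward model.
import torch.nn as nn

def make_uci_model(n_features: int) -> nn.Sequential:
    # Two hidden layers of 128 units, PReLU non-linearities, dropout rate 0.5,
    # and a single real-valued regression output.
    return nn.Sequential(
        nn.Linear(n_features, 128), nn.PReLU(), nn.Dropout(p=0.5),
        nn.Linear(128, 128), nn.PReLU(), nn.Dropout(p=0.5),
        nn.Linear(128, 1),
    )
```

Keeping dropout active at prediction time, as in the dropout-based uncertainty baselines the paper evaluates, turns repeated stochastic forward passes into samples from the predictive distribution.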