Distribution calibration for regression
Authors: Hao Song, Tom Diethe, Meelis Kull, Peter Flach
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed method is experimentally verified on a set of common regression models and shows improvements for both distribution-level and quantile-level calibration. (Abstract) |
| Researcher Affiliation | Collaboration | 1. University of Bristol, Bristol, United Kingdom; 2. Amazon Research, Cambridge, United Kingdom; 3. University of Tartu, Tartu, Estonia; 4. The Alan Turing Institute, London, United Kingdom. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. |
| Open Datasets | Yes | The experiments are applied on the following UCI datasets (sizes in parentheses): 1. Diabetes (442), 2. Boston (506), 3. Airfoil (1503), 4. Forest Fire (517), 5. Strength (1030), 6. Energy (19735). (Section 5) |
| Dataset Splits | No | All the experiments use a random (0.75, 0.25) train-test split, with both the base model and calibrators trained on the same set. (Section 5) |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'ADAM optimiser' but does not specify software details with version numbers (e.g., library or solver names with version numbers like Python 3.8, PyTorch 1.9). |
| Experiment Setup | Yes | The model is trained using the ADAM optimiser with a learning rate of 0.01. (Section 5, Synthetic data) [...] GP-Beta with 8, 16, 32 and 64 inducing points, batch size of 128, and 64 Monte-Carlo samples per batch to compute the objective function and the gradient. The parameters are again optimised using ADAM with a learning rate of 0.001. For the NNs, we use the same setting as in [14], which is a 2-layer fully-connected structure with 128 hidden units per layer and ReLU activation. The dropout rate is set to 0.5, default weight decay of 10^-4 and the length scale of 1.0 are used to approximate the mean and variance following the results given in [5]. (Section 5, Real world datasets) |
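
The (0.75, 0.25) random train-test split quoted in the Dataset Splits row maps directly onto standard tooling. A minimal sketch, assuming scikit-learn; only the Diabetes dataset ships with sklearn, and the remaining UCI datasets would have to be downloaded separately. The `random_state` value is an assumption, since the paper does not report seeds:

```python
# Sketch of the quoted (0.75, 0.25) random train-test split on the
# Diabetes dataset (442 samples), the smallest of the UCI datasets listed.
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0  # seed is an assumption
)
print(X_train.shape, X_test.shape)  # e.g. (331, 10) (111, 10)
```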
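
The Experiment Setup row quotes the NN base-model hyperparameters (2 fully-connected layers, 128 hidden units, ReLU, dropout 0.5, weight decay 10^-4, ADAM). The sketch below only instantiates those quoted settings; it is not the authors' code. PyTorch, the input dimension of 10, and the choice of the 0.001 learning rate (the paper quotes 0.01 for the synthetic-data model and 0.001 for GP-Beta) are assumptions:

```python
# Illustrative instantiation of the quoted NN base-model settings.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 128),   # input size 10 is a placeholder (dataset-dependent)
    nn.ReLU(),
    nn.Dropout(p=0.5),    # dropout rate 0.5 as quoted
    nn.Linear(128, 128),  # second fully-connected layer, 128 hidden units
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(128, 1),    # scalar regression output
)

# ADAM with weight decay 1e-4 as quoted; learning rate choice is an assumption.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)

# Forward pass with the quoted batch size of 128.
y_hat = model(torch.randn(128, 10))
```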