Calibration by Distribution Matching: Trainable Kernel Calibration Metrics
Authors: Charlie Marx, Sofian Zalouk, Stefano Ermon
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our empirical evaluation demonstrates that employing these metrics as regularizers enhances calibration, sharpness, and decision-making across a range of regression and classification tasks, outperforming methods relying solely on post-hoc recalibration." See also Section 6 (Experiments). |
| Researcher Affiliation | Academia | Charles Marx (Stanford University, ctmarx@cs.stanford.edu); Sofian Zalouk (Stanford University, szalouk@stanford.edu); Stefano Ermon (Stanford University, ermon@cs.stanford.edu) |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. Methodological steps are described in prose. |
| Open Source Code | Yes | Code to reproduce experiments can be found at https://github.com/kernel-calibration/kernel-calibration/. |
| Open Datasets | Yes | "We use four tabular UCI datasets (SUPERCONDUCTIVITY [16], CRIME [34], BLOG [6], FB-COMMENT [41]), as well as the Medical Expenditure Panel Survey dataset (MEDICAL-EXPENDITURE [7])." and "We use five tabular UCI datasets: BREAST-CANCER [49], HEART-DISEASE [18], ONLINE-SHOPPERS [40], DRY-BEAN [1], and ADULT [2]." |
| Dataset Splits | Yes | "For each dataset, we randomly assign 70% of the dataset for training, 10% for validation, and 20% for testing." (A split sketch appears below the table.) |
| Hardware Specification | Yes | "All experiments were conducted on a single CPU machine (Intel(R) Xeon(R) Gold 6342 CPU @ 2.80GHz), utilizing 8 cores per experiment." and "To accelerate training, we have also run some experiments using an 11GB NVIDIA GeForce GTX 1080 Ti." |
| Software Dependencies | No | The paper mentions software like Python and PyTorch generally, but does not provide specific version numbers for these or other key software components used in the experiments. |
| Experiment Setup | Yes | "For all experiments, we vary: Layer sizes between 32 and 512; RBF kernel bandwidths between 0.001 and 200; Batch sizes between 16 and 512, with and without batch normalization; Learning rates between 10^-7 and 10^-1; The loss mixture weight λ (as in NLL + λ MMD and XE + λ MMD) between 0.1 and 1000." (A sketch of the combined training objective appears below the table.) |
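
The dataset-split row quotes a 70% / 10% / 20% train/validation/test partition. Below is a minimal sketch of one way to reproduce such a split with scikit-learn; the function name `split_dataset` and the fixed seed are illustrative assumptions, not taken from the authors' repository.

```python
# Hypothetical 70/10/20 split helper (not the authors' code).
from sklearn.model_selection import train_test_split

def split_dataset(X, y, seed=0):
    # First carve off the 20% test set, then split the remaining 80%
    # into 70% train / 10% validation (0.125 of the remainder = 10% overall).
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.20, random_state=seed)
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=0.125, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```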
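The experiment-setup row refers to objectives of the form NLL + λ MMD (regression) and XE + λ MMD (classification), with an RBF kernel bandwidth and mixture weight λ as tuned hyperparameters. The following is a minimal PyTorch sketch of what such a regression objective could look like: a Gaussian predictive model trained with negative log-likelihood plus a λ-weighted RBF-kernel MMD term comparing (x, observed y) pairs against (x, y sampled from the model). Everything here (the `GaussianMLP` architecture, `mmd_loss` estimator, and default values) is an assumption for illustration, not the authors' implementation; the XE + λ MMD classification variant would follow the same pattern with a cross-entropy fit term.

```python
# Minimal sketch (not the authors' code) of an NLL + lambda * MMD training objective.
import torch
import torch.nn as nn

class GaussianMLP(nn.Module):
    """Small MLP predicting a Gaussian mean and log-variance per input."""
    def __init__(self, in_dim, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 2))

    def forward(self, x):
        out = self.body(x)
        return out[:, :1], out[:, 1:]  # mean, log-variance, each (N, 1)

def rbf_kernel(a, b, bandwidth=1.0):
    # Pairwise RBF kernel between rows of a and b.
    d2 = torch.cdist(a, b) ** 2
    return torch.exp(-d2 / (2 * bandwidth ** 2))

def mmd_loss(x, y_true, y_samp, bandwidth=1.0):
    # Biased (V-statistic) estimate of MMD^2 between the joint samples
    # (x, y_true) and (x, y_samp). x: (N, d); y_true, y_samp: (N, 1).
    real = torch.cat([x, y_true], dim=1)
    fake = torch.cat([x, y_samp], dim=1)
    return (rbf_kernel(real, real, bandwidth).mean()
            + rbf_kernel(fake, fake, bandwidth).mean()
            - 2 * rbf_kernel(real, fake, bandwidth).mean())

def training_step(model, x, y, lam=1.0, bandwidth=1.0):
    mu, log_var = model(x)
    # Gaussian negative log-likelihood (up to an additive constant).
    nll = 0.5 * (log_var + (y - mu) ** 2 / log_var.exp()).mean()
    # Reparameterized sample from the predictive distribution for the MMD term.
    y_samp = mu + torch.randn_like(mu) * (0.5 * log_var).exp()
    return nll + lam * mmd_loss(x, y, y_samp, bandwidth)
```

Under this reading, the quoted search ranges correspond to sweeping `hidden` (32 to 512), `bandwidth` (0.001 to 200), the batch size (16 to 512), the learning rate (10^-7 to 10^-1), and `lam` (0.1 to 1000).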