reproducibilityindex.ai

Truthfulness of Calibration Measures

Authors: Nika Haghtalab, Mingda Qiao, Kunhe Yang, Eric Zhao

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We study calibration measures in a sequential prediction setup. We introduce a new calibration measure termed the Subsampled Smooth Calibration Error (SSCE), which is complete and sound, and under which truthful prediction is optimal up to a constant multiplicative factor. We answer this question in three parts: Part I: We show that existing calibration measures do not simultaneously meet these criteria. ... Part II: We introduce a new calibration measure, called SSCE, that is sound, complete, and approximately truthful. ... Part III: There is a forecasting algorithm that achieves O(√T) SSCE even in the adversarial setting.
Researcher Affiliation	Academia	Nika Haghtalab, Mingda Qiao, Kunhe Yang, and Eric Zhao; University of California, Berkeley; {nika,mingda.qiao,kunheyang,eric.zh}@berkeley.edu
Pseudocode	Yes	Algorithm 1: Forecaster for Product Distributions
Open Source Code	No	The paper does not contain any statement about releasing source code or links to a code repository.
Open Datasets	No	This is a theoretical paper and does not involve the use of datasets for training or evaluation.
Dataset Splits	No	This is a theoretical paper and does not describe any experimental data splits (training, validation, test).
Hardware Specification	No	This is a theoretical paper and does not describe any hardware used for experiments.
Software Dependencies	No	This is a theoretical paper and does not list any specific software dependencies with version numbers.
Experiment Setup	No	This is a theoretical paper and does not describe any experimental setup details such as hyperparameters or training configurations.