Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].

Consistency Conditions for Differentiable Surrogate Losses

Authors: Drona Khurana, Anish Thilagar, Dhamma Kimpara, Rafael Frongillo

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	We first prove that under mild conditions, IE and calibration are equivalent for one-dimensional losses in this class. We construct a counter-example that shows that this equivalence fails in higher dimensions. This motivates the introduction of strong IE, a strengthened form of IE that is equally easy to verify. We establish that strong IE implies calibration for differentiable surrogates and is both necessary and sufficient for strongly convex, differentiable surrogates. Finally, we apply these results to a range of problems to demonstrate the power of IE and strong IE for designing and analyzing consistent differentiable surrogates. [...] Theoretical Contributions. We first show that IE and calibration are equivalent for 1-d convex, differentiable losses (§3). In higher dimensions, however, IE no longer implies calibration, even for strongly convex surrogates (Example 2). To address this disparity, we propose a novel strengthening of IE we call strong indirect elicitation (strong IE; see Definition 6 in §3.4). We prove that under mild technical assumptions, strong IE implies calibration for differentiable surrogates (Theorem 2). Moreover, for the important class of strongly convex, differentiable surrogates, we show that strong IE is both necessary and sufficient for calibration (Theorem 3).
Researcher Affiliation	Academia	Drona Khurana University of Colorado Boulder EMAIL Anish Thilagar University of Colorado Boulder EMAIL Dhamma Kimpara NSF National Center for Atmospheric Research Boulder, Colorado EMAIL Rafael Frongillo University of Colorado Boulder EMAIL
Pseudocode	Yes	Subroutine 1: Lin Int Grad(X); Construction 2: SURROGATE CONSTRUCTION
Open Source Code	No	Answer: [NA] Justification: There is no data or code.
Open Datasets	No	Answer: [NA] Justification: There is no data or code. Justification: We use no data.
Dataset Splits	No	Answer: [NA] Justification: We use no data.
Hardware Specification	No	Answer: [NA] Justification: We just used our brains.
Software Dependencies	No	Answer: [NA] Justification: We use no data.
Experiment Setup	No	Answer: [NA] Justification: We use no data.