Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Testing Calibration in Nearly-Linear Time
Authors: Lunjia Hu, Arun Jambulapati, Kevin Tian, Chutong Yang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present experiments showing the testing problem we define faithfully captures standard notions of calibration, and that our algorithms scale efficiently to accommodate large sample sizes. |
| Researcher Affiliation | Academia | Lunjia Hu (Harvard University), Arun Jambulapati (University of Michigan), Kevin Tian (University of Texas at Austin), Chutong Yang (University of Texas at Austin) |
| Pseudocode | Yes | Algorithm 1 Apply(g, ℓ, r, τ) |
| Open Source Code | Yes | Our code is included in the supplementary material. |
| Open Datasets | Yes | We trained a DenseNet40 model [HLvdMW17] on the CIFAR-100 dataset [Kri09] |
| Dataset Splits | No | The paper mentions synthetic datasets, CIFAR-100, and drawing samples, but does not specify explicit train/validation/test splits, exact percentages, or sample counts for these splits. |
| Hardware Specification | Yes | The experiments in the first and third part of this section are run on a 2018 laptop with 2.2 GHz 6-Core Intel Core i7 processor. The experiments in the second part are run on a cluster using 2x AMD EPYC 7763 64-Core Processor and a single NVIDIA A100 PCIE 40GB. |
| Software Dependencies | No | The paper mentions using 'a linear program solver from CVXPY [DB16, AVDB18]', 'a commercial minimum-cost flow solver from Gurobi Optimization [Opt23]', and 'the PyPy package [PyP19]'. However, it does not specify version numbers for these software dependencies. |
| Experiment Setup | No | The paper describes training a DenseNet40 model and learning postprocessing functions but does not explicitly provide specific hyperparameter values like learning rate, batch size, or detailed optimizer settings in the main text. |