Testing Calibration in Nearly-Linear Time
Authors: Lunjia Hu, Arun Jambulapati, Kevin Tian, Chutong Yang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present experiments showing the testing problem we define faithfully captures standard notions of calibration, and that our algorithms scale efficiently to accommodate large sample sizes. |
| Researcher Affiliation | Academia | Lunjia Hu Harvard University lunjia@alumni.stanford.edu Arun Jambulapati University of Michigan jmblpati@gmail.com Kevin Tian University of Texas at Austin kjtian@cs.utexas.edu Chutong Yang University of Texas at Austin cyang98@utexas.edu |
| Pseudocode | Yes | Algorithm 1 Apply(g, ℓ, r, τ) |
| Open Source Code | Yes | Our code is included in the supplementary material. |
| Open Datasets | Yes | We trained a DenseNet40 model [HLvdMW17] on the CIFAR-100 dataset [Kri09] |
| Dataset Splits | No | The paper mentions synthetic datasets, CIFAR-100, and drawing samples, but does not specify explicit train/validation/test splits, exact percentages, or sample counts for these splits. |
| Hardware Specification | Yes | The experiments in the first and third part of this section are run on a 2018 laptop with 2.2 GHz 6-Core Intel Core i7 processor. The experiments in the second part are run on a cluster using 2x AMD EPYC 7763 64-Core Processor and a single NVIDIA A100 PCIE 40GB. |
| Software Dependencies | No | The paper mentions using 'a linear program solver from CVXPY [DB16, AVDB18]', 'a commercial minimum-cost flow solver from Gurobi Optimization [Opt23]', and 'the PyPy package [PyP19]'. However, it does not specify version numbers for these software dependencies. |
| Experiment Setup | No | The paper describes training a DenseNet40 model and learning postprocessing functions but does not explicitly provide specific hyperparameter values like learning rate, batch size, or detailed optimizer settings in the main text. |