Minimum-Risk Recalibration of Classifiers
Authors: Zeyu Sun, Dogyoon Song, Alfred Hero
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical findings through numerical simulations, which confirm the tightness of the proposed bounds, the optimal number of bins, and the effectiveness of label shift adaptation. |
| Researcher Affiliation | Academia | Zeyu Sun University of Michigan zeyusun@umich.edu Dogyoon Song University of Michigan dogyoons@umich.edu Alfred Hero University of Michigan hero@eecs.umich.edu |
| Pseudocode | No | The paper describes methods with numbered steps but does not explicitly present any pseudocode or labeled algorithm blocks. |
| Open Source Code | Yes | Our simulation code is available at https://github.com/ZeyuSun/calibration_label_shift. |
| Open Datasets | No | The paper uses simulated data for its experiments, generated from defined distributions (e.g., 'family of joint distributions D(π) of X and Y'). It does not use or provide access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper specifies sample sizes for source and target data ('$n_P = 10^3$ and $n_Q = 10^2$') and 'calibration sample size to be $n = 5000$', but it does not explicitly detail train, validation, and test splits with percentages or counts for a fixed dataset. |
| Hardware Specification | No | The paper mentions 'numerical simulations' but does not provide specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper does not provide a reproducible description of ancillary software with specific version numbers for key components or libraries. |
| Experiment Setup | Yes | We vary $n \in [10^2, 10^7]$ and $B \in [6, 10^3]$ in the log scale. For each combination of $(n, B)$, we use UMB to recalibrate $f$ on data generated from $D(0.5)$, and compute quadrature estimates of population $R^{\mathrm{cal}}(\hat{h})$, $R^{\mathrm{sha}}(\hat{h})$, and $R(\hat{h})$, as well as their high probability bounds based on Theorem 1. The constant $K$ in Assumption (A3) is selected by numerical maximization. For each setting, we fix calibration sample size to be $n = 5000$. We consider the label shift with source distribution $D(0.5)$ and target distribution $D(\pi_Q)$, where $\pi_Q$ varies in $\{0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5\}$. We vary $n_P$ in $\{10, 10^3, 10^5, 10^7\}$ and $n_Q$ in $\{10, 10^3, 10^5\}$. The number of bins $B$ is chosen to be $n_P^{1/3}$ for COMPOSITE and SOURCE, and $n_Q^{1/3}$ for TARGET. |
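
The experiment setup recalibrates a classifier with UMB (uniform-mass binning) and ties the number of bins to the cube root of the sample size. The following is a minimal sketch of that idea, not the authors' code: the function names `fit_umb` and `apply_umb` are hypothetical, the synthetic scores and labels stand in for the paper's distribution D(π) and classifier f (which are not reproduced here), and rounding n^(1/3) to the nearest integer is an assumption.

```python
import bisect
import random


def fit_umb(scores, labels, num_bins):
    """Uniform-mass binning: split the sorted scores into equal-count bins;
    each bin predicts the mean label of the calibration points inside it.
    Assumes len(scores) >= num_bins and distinct score values."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    edges, means = [], []
    for b in range(num_bins):
        lo, hi = b * n // num_bins, (b + 1) * n // num_bins
        members = order[lo:hi]
        means.append(sum(labels[i] for i in members) / len(members))
        if b < num_bins - 1:
            # Bin boundary: midpoint between the two neighbouring sorted scores.
            edges.append(0.5 * (scores[order[hi - 1]] + scores[order[hi]]))
    return edges, means


def apply_umb(score, edges, means):
    """Map a raw score to the mean label of the bin it falls into."""
    return means[bisect.bisect_right(edges, score)]


# Hypothetical calibration data: P(Y = 1 | score = s) = s^2, so the raw
# score is deliberately miscalibrated.
random.seed(0)
n = 5000                      # calibration sample size quoted in the table
B = round(n ** (1 / 3))       # cube-root rule; the rounding is an assumption
scores = [random.random() for _ in range(n)]
labels = [1 if random.random() < s ** 2 else 0 for s in scores]

edges, means = fit_umb(scores, labels, B)
calibrated = [apply_umb(s, edges, means) for s in scores]
```

Because every calibration point is mapped to the mean label of its own bin, the average recalibrated prediction on the calibration set matches the empirical label frequency. More bins give a finer-grained output at the cost of noisier per-bin estimates, which is the trade-off the cube-root rule is meant to balance.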