Online Platt Scaling with Calibeating
Authors: Chirag Gupta, Aaditya Ramdas
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, it is effective on a range of synthetic and real-world datasets, with and without distribution drifts, achieving superior performance without hyperparameter tuning. |
| Researcher Affiliation | Academia | Carnegie Mellon University, Pittsburgh PA, USA. |
| Pseudocode | Yes | Algorithm 1 in the Appendix contains pseudocode for our final OPS implementation. |
| Open Source Code | Yes | Code to reproduce the experiments can be found at https://github.com/aigen/df-posthoc-calibration (see Appendix A.4 for more details). |
| Open Datasets | Yes | We worked with four public datasets in two settings. Links to the datasets are in Appendix A.1. ... Table 2: Metadata for datasets used in Section 4.1. |
| Dataset Splits | No | The paper describes training data, and a 'test-stream' which is also used for 'recalibration' (calibration data for online learning), but it does not explicitly define a separate 'validation' dataset split for purposes like hyperparameter tuning or early stopping. |
| Hardware Specification | Yes | For computation, we used allocation CIS220171 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, supported by NSF grants 2138259, 2138286, 2138307, 2137603, and 2138296. Specifically, we used the Bridges2 system (Towns et al., 2014), supported by NSF grant 1928147, at the Pittsburgh Supercomputing Center (PSC). |
| Software Dependencies | No | the base model f was a random forest (sklearn's implementation). |
| Experiment Setup | Yes | Thus we used ONS for experiments based on a verbatim implementation of Algorithm 12 in Hazan (2016), with $\gamma = 0.1$, $\rho = 100$, and $K = \{(a, b) : \lVert (a, b) \rVert_2 \le 100\}$. ... All default parameters were used, except n_estimators was set to 1000. No hyperparameter tuning on individual datasets was performed for any of the recalibration methods. |
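
The Experiment Setup row can be made concrete with a rough sketch of one Online Newton Step (ONS) update on the two Platt scaling parameters (a, b). This is not the authors' code. It assumes that ρ scales the initial matrix A_0 = ρI, that the parameters start at (a, b) = (1, 0), and that the loss is the log loss of the Platt-scaled forecast. It also swaps in a plain Euclidean projection onto K, whereas ONS as given in Hazan (2016) projects in the norm induced by A_t. The names `ons_step`, `platt`, `GAMMA`, `RHO`, and `RADIUS` are hypothetical.

```python
import math

GAMMA = 0.1     # gamma from the setup row
RHO = 100.0     # rho; ASSUMED here to scale the initial matrix A_0 = RHO * I
RADIUS = 100.0  # radius of K = {(a, b) : ||(a, b)||_2 <= 100}


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(p, eps=1e-12):
    """Log-odds of p, with p clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def platt(a, b, p):
    """Platt-scaled forecast sigmoid(a * logit(p) + b) for base forecast p."""
    return sigmoid(a * logit(p) + b)


def ons_step(theta, A, p, y):
    """One ONS-style update of theta = (a, b) after observing label y in {0, 1}.

    A is the running 2x2 matrix (nested lists). Returns (new_theta, new_A).
    """
    a, b = theta
    x = logit(p)
    q = sigmoid(a * x + b)
    # Gradient of the log loss with respect to (a, b).
    g0, g1 = (q - y) * x, (q - y)
    # Rank-one update: A <- A + g g^T.
    A = [[A[0][0] + g0 * g0, A[0][1] + g0 * g1],
         [A[1][0] + g1 * g0, A[1][1] + g1 * g1]]
    # Solve A s = g with the closed-form 2x2 inverse.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    s0 = (A[1][1] * g0 - A[0][1] * g1) / det
    s1 = (A[0][0] * g1 - A[1][0] * g0) / det
    # Newton-style step with step size 1 / GAMMA.
    a_new, b_new = a - s0 / GAMMA, b - s1 / GAMMA
    # SIMPLIFICATION: Euclidean projection onto the ball K,
    # not the A-norm projection used by ONS in Hazan (2016).
    norm = math.hypot(a_new, b_new)
    if norm > RADIUS:
        a_new, b_new = a_new * RADIUS / norm, b_new * RADIUS / norm
    return (a_new, b_new), A
```

To use it on a stream, one would start from `theta = (1.0, 0.0)` and `A = [[RHO, 0.0], [0.0, RHO]]`, report `platt(*theta, p)` for each incoming base forecast `p`, and call `ons_step` once the label arrives. Under the quoted setup, the base forecasts would come from a random forest with n_estimators set to 1000.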