Online Platt Scaling with Calibeating

Authors: Chirag Gupta, Aaditya Ramdas

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, it is effective on a range of synthetic and real-world datasets, with and without distribution drifts, achieving superior performance without hyperparameter tuning.
Researcher Affiliation | Academia | Carnegie Mellon University, Pittsburgh PA, USA.
Pseudocode | Yes | Algorithm 1 in the Appendix contains pseudocode for our final OPS implementation.
Open Source Code | Yes | Code to reproduce the experiments can be found at https://github.com/aigen/df-posthoc-calibration (see Appendix A.4 for more details).
Open Datasets | Yes | We worked with four public datasets in two settings. Links to the datasets are in Appendix A.1. ... Table 2: Metadata for datasets used in Section 4.1.
Dataset Splits | No | The paper describes training data and a "test-stream" that is also used for recalibration (calibration data for online learning), but it does not explicitly define a separate validation split for purposes such as hyperparameter tuning or early stopping.
Hardware Specification | Yes | For computation, we used allocation CIS220171 from the Advanced Cyberinfrastructure Coordination Ecosystem: Services & Support (ACCESS) program, supported by NSF grants 2138259, 2138286, 2138307, 2137603, and 2138296. Specifically, we used the Bridges2 system (Towns et al., 2014), supported by NSF grant 1928147, at the Pittsburgh Supercomputing Center (PSC).
Software Dependencies | No | The base model f was a random forest (sklearn's implementation).
Experiment Setup | Yes | Thus we used ONS for experiments based on a verbatim implementation of Algorithm 12 in Hazan (2016), with γ = 0.1, ρ = 100, and K = {(a, b) : ‖(a, b)‖₂ ≤ 100}. ... All default parameters were used, except n_estimators was set to 1000. No hyperparameter tuning on individual datasets was performed for any of the recalibration methods.