reproducibilityindex.ai

Obtaining Well Calibrated Probabilities Using Bayesian Binning

Authors: Mahdi Pakdaman Naeini, Gregory Cooper, Milos Hauskrecht

AAAI 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The method is computationally tractable, and empirically accurate, as evidenced by the set of experiments reported here on both real and simulated datasets. This section describes the set of experiments that we performed to evaluate the performance of the proposed calibration method in comparison to other commonly used calibration methods: histogram binning, Platt's method, and isotonic regression.
Researcher Affiliation	Academia	Mahdi Pakdaman Naeini (1), Gregory F. Cooper (1, 2), and Milos Hauskrecht (1, 3). (1) Intelligent Systems Program, University of Pittsburgh, PA, USA; (2) Department of Biomedical Informatics, University of Pittsburgh, PA, USA; (3) Computer Science Department, University of Pittsburgh, PA, USA
Pseudocode	No	The paper describes the mathematical formulation of the BBQ method and evaluation measures but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code	Yes	An implementation of BBQ method can be found at the following address: https://github.com/pakdaman/calibration.git
Open Datasets	Yes	In terms of real data, we used 30 different real world binary classification data sets from the UCI and LIBSVM repository (Bache and Lichman 2013; Chang and Lin 2011).
Dataset Splits	No	The data were divided into 1000 instances for training and calibrating the prediction model, and 1000 instances for testing the models. The paper mentions training and testing data splits but does not explicitly define a separate validation dataset split.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or cloud computing specifications.
Software Dependencies	No	The paper mentions using classifiers like Logistic Regression, SVM, and Naive Bayes, and refers to the UCI and LIBSVM repositories for datasets. However, it does not specify any software names with version numbers for libraries, frameworks, or programming languages used in the experiments.
Experiment Setup	Yes	We define the range of possible values of the number of bins as $B \in \{\sqrt[3]{N}/C, \ldots, C\sqrt[3]{N}\}$, where C is a constant that controls the number of binning models (C = 10 in our experiments). We set N′ = 2 in our experiments. We fix a small number ρ > 0 (ρ = 0.001 in our experiments). In computing these measures, the predictions are sorted and partitioned into K fixed number of bins (K = 10 in our experiments).
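
The experiment setup above can be made concrete with a short sketch. It is not taken from the authors' repository. The function names (`candidate_bin_counts`, `calibration_errors`) are hypothetical, the rounding of the range endpoints to integers is an assumption, and the sketch assumes that "these measures" are the expected and maximum calibration error (ECE and MCE) computed over equal-size bins of the sorted predictions.

```python
import math


def candidate_bin_counts(n_train, c=10):
    """Integer bin counts B with cbrt(N)/C <= B <= C*cbrt(N)."""
    # Round the cube root to absorb floating-point error
    # (1000 ** (1/3) evaluates to 9.999999999999998).
    root = round(n_train ** (1.0 / 3.0), 9)
    low = max(1, math.ceil(root / c))  # assumption: round up, keep at least 1 bin
    high = math.floor(root * c)        # assumption: round down
    return list(range(low, high + 1))


def calibration_errors(probs, labels, k=10):
    """Return (ECE, MCE) over k near-equal-size bins of the sorted predictions.

    Assumed definitions: per bin, the gap is |fraction of positives - mean
    predicted probability|; ECE weights gaps by bin size, MCE takes the max.
    """
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    ece, mce = 0.0, 0.0
    for i in range(k):
        chunk = pairs[i * n // k:(i + 1) * n // k]
        if not chunk:
            continue  # fewer predictions than bins
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        frac_pos = sum(y for _, y in chunk) / len(chunk)
        gap = abs(frac_pos - mean_pred)
        ece += (len(chunk) / n) * gap
        mce = max(mce, gap)
    return ece, mce


# With N = 1000 training instances and C = 10, the range runs from 1 to 100 bins.
bins = candidate_bin_counts(1000)
print(bins[0], bins[-1], len(bins))  # 1 100 100
```

Under these assumptions, the 1000-instance training set described in the Dataset Splits row yields 100 candidate binning models, one per bin count from 1 to 100.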