Consistency Analysis for Binary Classification Revisited

Authors: Krzysztof Dembczyński, Wojciech Kotłowski, Oluwasanmi Koyejo, Nagarajan Natarajan

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 5 ("Experiments"): "We empirically evaluate the effectiveness and accuracy of ETU approximations introduced in Section 4.1, on synthetic as well as real datasets. We also show on several benchmark datasets that, by carefully calibrating the conditional probabilities in ETU, we can improve the classification performance."
Researcher Affiliation | Collaboration | (1) Institute of Computing Science, Poznan University of Technology, Poland; (2) Department of Computer Science, University of Illinois at Urbana-Champaign, USA; (3) Microsoft Research, India.
Pseudocode | Yes | Algorithm 1, "Approximate ETU Consistent Classifier".
Open Source Code | No | No explicit statement about releasing open-source code for the described methodology, nor a link to such code, was found.
Open Datasets | Yes | Results are reported on seven multiclass and multilabel benchmark datasets: (1) LETTERS: 16,000 train / 4,000 test; (2) SCENE: 1,137 train / 1,093 test; (3) YEAST: 1,500 train / 917 test; (4) WEBPAGE: 6,956 train / 27,824 test; (5) IMAGE: 1,300 train / 1,010 test; (6) BREAST CANCER: 463 train / 220 test; (7) SPAMBASE: 3,071 train / 1,530 test. See (Koyejo et al., 2014b; Ye et al., 2012) for details.
Dataset Splits | Yes | Train/test sizes are given for all seven benchmark datasets (listed under Open Datasets above). Additionally: "... one uses a validation sample S = {(x_i, y_i)}_{i=1}^n to choose a threshold on η̂(x)."
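The validation-thresholding step quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration of the plug-in approach, not the paper's Algorithm 1: the metric (F1 here), the candidate-threshold grid, and the function names are assumptions.

```python
import numpy as np

def f1_score(y_true, y_pred):
    """Binary F1; returns 0.0 when there are no true or predicted positives."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def tune_threshold(eta_hat, y_val, metric=f1_score):
    """Pick the threshold on estimated probabilities eta_hat = η̂(x_i) that
    maximizes `metric` on a validation sample S = {(x_i, y_i)}_{i=1}^n.

    Candidate thresholds: the distinct predicted probabilities themselves,
    since the induced classifier only changes at those values.
    """
    best_t, best_score = 0.5, -1.0
    for t in np.unique(eta_hat):
        score = metric(y_val, (eta_hat >= t).astype(int))
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Toy validation sample: η̂ estimates and true labels.
eta_hat = np.array([0.1, 0.4, 0.6, 0.9])
y_val = np.array([0, 0, 1, 1])
t_star, score = tune_threshold(eta_hat, y_val)  # t_star = 0.6, score = 1.0
```

In practice η̂ would come from a probabilistic classifier (e.g., logistic regression, as the paper uses) fit on the training split, with the threshold tuned on a held-out validation split.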
Hardware Specification | No | No specific hardware details (GPU/CPU models, memory, or computing environment) used for the experiments were mentioned in the paper.
Software Dependencies | No | No software dependencies with version numbers were mentioned. The paper describes algorithms and models (e.g., logistic regression, Isotron) but not the software environments or libraries used to implement them.
Experiment Setup | No | No specific experimental setup details, such as hyperparameter values (learning rates, batch sizes, number of epochs) or detailed training configurations, were provided in the main text; the paper mentions using "standard logistic regression" but without concrete settings.