Consistent Binary Classification with Generalized Performance Metrics

Authors: Oluwasanmi O Koyejo, Nagarajan Natarajan, Pradeep K Ravikumar, Inderjit S Dhillon

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present empirical comparisons between these algorithms on benchmark datasets.
Researcher Affiliation | Academia | Oluwasanmi Koyejo, Department of Psychology, Stanford University (sanmi@stanford.edu); Nagarajan Natarajan, Department of Computer Science, University of Texas at Austin (naga86@cs.utexas.edu); Pradeep Ravikumar, Department of Computer Science, University of Texas at Austin (pradeepr@cs.utexas.edu); Inderjit S. Dhillon, Department of Computer Science, University of Texas at Austin (inderjit@cs.utexas.edu)
Pseudocode | Yes | Algorithm 1: Two-Step EUM; Algorithm 2: Weighted ERM
Open Source Code | No | The paper gives no explicit statement or link indicating that its source code is publicly available.
Open Datasets | Yes | We present experiments on synthetic data... We also compare the two proposed algorithms on benchmark datasets... REUTERS (obtained the processed dataset from [20]). ... LETTERS dataset... SCENE dataset (UCI benchmark)... WEBPAGE binary text categorization dataset obtained from [21]... IMAGE (2068 images...)[22], BREAST CANCER (683 instances...) and SPAMBASE (4601 instances...)[23].
Dataset Splits | Yes | We split the training data S into two sets S1 and S2: S1 is used for estimating η̂_x and S2 for selecting δ. ... LETTERS dataset consisting of 20000 handwritten letters (16000 training and 4000 test instances). ... SCENE dataset (1137 training and 1093 test instances). ... WEBPAGE (6956 train and 27824 test). ... IMAGE (1300 train, 1010 test). ... BREAST CANCER (463 train, 220 test). ... SPAMBASE (3071 train, 1530 test).
Hardware Specification | No | The paper does not specify the hardware used for the experiments, such as GPU/CPU models or memory.
Software Dependencies | No | The paper mentions methods such as 'logistic loss', 'L2 regularization', and 'weighted logistic regression' but names no software packages, libraries, or version numbers.
Experiment Setup | No | The paper mentions 'L2 regularization' and 'grid the space [0, 1] to find the best threshold' but does not give specific hyperparameter values (e.g., regularization strength, learning rates, number of epochs) or detailed system-level training settings.
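
The threshold-selection step quoted in the Dataset Splits and Experiment Setup rows (fit a probability estimate on S1, then grid [0, 1] on S2 to pick δ) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the 101-point grid, the choice of F1 as the metric, and the toy scores are all assumptions made for illustration, and the scores stand in for the output of whatever probability model was fit on S1.

```python
def f1_score(labels, predictions):
    """F1 for labels/predictions in {0, 1}; returns 0.0 when there are no true positives."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)


def select_threshold(scores, labels, metric=f1_score, grid_size=101):
    """Grid [0, 1] and return (best_delta, best_value) measured on the held-out set S2.

    scores: estimated P(y = 1 | x) for each S2 example, produced by a model fit on S1.
    grid_size: number of evenly spaced thresholds (an assumed value, not from the paper).
    """
    best_delta, best_value = 0.0, -1.0
    for i in range(grid_size):
        delta = i / (grid_size - 1)
        # Predict positive when the estimated probability exceeds the threshold.
        predictions = [1 if s > delta else 0 for s in scores]
        value = metric(labels, predictions)
        if value > best_value:  # strict comparison: ties keep the smallest threshold
            best_delta, best_value = delta, value
    return best_delta, best_value


# Toy S2 with made-up scores and labels (hypothetical data, for illustration only).
s2_scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.80]
s2_labels = [0, 0, 1, 0, 1, 1]
delta_hat, f1_hat = select_threshold(s2_scores, s2_labels)
```

Passing a different `metric` callable swaps in another performance measure without changing the search. Because the table notes that the paper reports no grid resolution or regularization strength, any concrete values used to reproduce the experiments would have to be chosen by the reader.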