Consistent Binary Classification with Generalized Performance Metrics

Authors: Oluwasanmi O Koyejo, Nagarajan Natarajan, Pradeep K Ravikumar, Inderjit S Dhillon

NeurIPS 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We present empirical comparisons between these algorithms on benchmark datasets.
Researcher Affiliation	Academia	Oluwasanmi Koyejo, Department of Psychology, Stanford University, sanmi@stanford.edu; Nagarajan Natarajan, Department of Computer Science, University of Texas at Austin, naga86@cs.utexas.edu; Pradeep Ravikumar, Department of Computer Science, University of Texas at Austin, pradeepr@cs.utexas.edu; Inderjit S. Dhillon, Department of Computer Science, University of Texas at Austin, inderjit@cs.utexas.edu
Pseudocode	Yes	Algorithm 1: Two-Step EUM; Algorithm 2: Weighted ERM
Open Source Code	No	The paper does not provide an explicit statement or link indicating that its source code is open-source or publicly available.
Open Datasets	Yes	We present experiments on synthetic data... We also compare the two proposed algorithms on benchmark datasets... REUTERS (obtained the processed dataset from [20]). ... LETTERS dataset... SCENE dataset (UCI benchmark)... WEBPAGE binary text categorization dataset obtained from [21]... IMAGE (2068 images...)[22], BREAST CANCER (683 instances...) and SPAMBASE (4601 instances...)[23].
Dataset Splits	Yes	We split the training data S into two sets S1 and S2: S1 is used for estimating η̂_x and S2 for selecting δ. ... LETTERS dataset consisting of 20000 handwritten letters (16000 training and 4000 test instances). ... SCENE dataset (1137 training and 1093 test instances). ... WEBPAGE (6956 train and 27824 test). ... IMAGE (1300 train, 1010 test). ... BREAST CANCER (463 train, 220 test). ... SPAMBASE (3071 train, 1530 test).
Hardware Specification	No	The paper does not provide specific details about the hardware used for the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies	No	The paper mentions software components like 'logistic loss', 'L2 regularization', and 'weighted logistic regression' but does not specify their version numbers or list other required software dependencies with versions.
Experiment Setup	No	The paper mentions 'L2 regularization' and 'grid the space [0, 1] to find the best threshold' but does not provide specific hyperparameter values (e.g., regularization strength, learning rates, number of epochs) or detailed system-level training settings.
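
Since no source code is released, the two-step procedure quoted in the table (estimate class probabilities on S1, then grid [0, 1] on S2 for the best threshold δ) may be easier to follow with a small illustration. The following is a minimal sketch, not the authors' implementation. Everything the table does not state is an assumption made here: the 50/50 split of S, the choice of F1 as the target metric, the synthetic data, the function names, and all hyperparameter values (regularization strength, step size, epochs, grid size), which the paper does not report.

```python
import math
import random


def predict_proba(w, b, x):
    """Estimated P(y = 1 | x) under a logistic model."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, l2=0.01, lr=0.5, epochs=200):
    """Gradient descent on the L2-regularized logistic loss.
    The values of l2, lr and epochs are illustrative assumptions."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def f1_score(y_true, y_pred):
    """F1 = 2TP / (2TP + FP + FN); returns 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def two_step_threshold(X, y, metric=f1_score, grid_size=101, seed=0):
    """Step 1: fit a probability estimate on S1.
    Step 2: pick the threshold delta on S2 that maximizes the metric."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2  # assumed 50/50 split of S into S1 and S2
    s1, s2 = idx[:half], idx[half:]
    w, b = fit_logistic([X[i] for i in s1], [y[i] for i in s1])
    probs = [predict_proba(w, b, X[i]) for i in s2]
    y2 = [y[i] for i in s2]
    best_delta, best_score = 0.0, -1.0
    for k in range(grid_size):
        delta = k / (grid_size - 1)  # uniform grid over [0, 1]
        preds = [1 if p > delta else 0 for p in probs]
        score = metric(y2, preds)
        if score > best_score:
            best_delta, best_score = delta, score
    return w, b, best_delta, best_score


# Synthetic, imbalanced one-dimensional data (purely illustrative).
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0)] for _ in range(400)]
y = [1 if rng.random() < 1.0 / (1.0 + math.exp(-(2.0 * x[0] - 1.5))) else 0
     for x in X]

w, b, delta, score = two_step_threshold(X, y)
print("selected threshold:", delta, "F1 on S2:", round(score, 3))
```

The final classifier predicts the positive class when the estimated probability exceeds the selected δ. Because 0.5 lies on the grid, the selected threshold never scores worse on S2 than the default 0.5 cutoff under the chosen metric.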