Optimal Binary Classification Beyond Accuracy

Authors: Shashank Singh, Justin T. Khim

NeurIPS 2022

Reproducibility Variable Result LLM Response
Research Type Experimental We further illustrate these contributions numerically in the case of k-nearest neighbor classification. We provide two numerical experiments to illustrate our results from Sections 4 and 5.
Researcher Affiliation Collaboration Shashank Singh Max Planck Institute for Intelligent Systems Tübingen, Germany shashankssingh44@gmail.com Justin Khim Amazon New York, NY jkhim@amazon.com
Pseudocode No The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code Yes Python implementations and instructions for reproducing each experiment can be found at https://gitlab.tuebingen.mpg.de/shashank/imbalanced-binary-classification-experiments.
Open Datasets No The experiments use synthetic data generated from specified distributions, not publicly available datasets with concrete access information. For instance: 'Suppose X = [0, 1], over which X is uniformly distributed... we drew n independent samples of (X, Y) according to the above distribution.'
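The quoted synthetic-data setup can be sketched as follows. This is a minimal illustration only: the conditional class probability `eta` used here is a hypothetical stand-in, since the paper's exact choice is elided in the quote above.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_xy(n, eta=lambda x: x):
    # X is uniform on [0, 1]; eta(x) is a hypothetical class-probability
    # function P(Y=1 | X=x) standing in for the paper's elided choice.
    x = rng.uniform(0.0, 1.0, size=n)
    y = (rng.uniform(size=n) < eta(x)).astype(int)
    return x, y

x_train, y_train = sample_xy(1000)   # n training samples
x_test, y_test = sample_xy(1000)     # 1000 independently generated test samples
```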
Dataset Splits No The paper mentions generating 'n independent samples' for training and '1000 more independently generated test samples' for evaluation, but does not specify explicit training/validation/test dataset splits with percentages or counts for a separate validation set.
Hardware Specification No The paper states that hardware specifications are included in Appendix E ('Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix E.'), but Appendix E is not provided in the given text.
Software Dependencies No The paper mentions 'Python's scikit-learn package' but does not specify version numbers for any software dependencies.
Experiment Setup Yes Using this training data, we selected optimal deterministic and stochastic thresholds t ∈ [0, 1] and (t, p) ∈ [0, 1]² for the k-NN classifier by maximizing M(Ĉ) over 10⁴ uniformly spaced values in [0, 1] and [0, 1]², respectively. Since, in this example, α = d = 1, we set k = ⌊n^{2/3}⌋ as suggested by Theorem 16.
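The deterministic-threshold part of this setup can be sketched as below. The data distribution and the metric are assumptions: X uniform on [0, 1] with a hypothetical P(Y=1|X=x) = x, and the F1 score standing in for the paper's metric M(Ĉ); only the grid search over 10⁴ threshold values and the choice k = ⌊n^{2/3}⌋ come from the quote.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Hypothetical synthetic data: X uniform on [0, 1], P(Y=1|X=x) = x.
n = 1000
X = rng.uniform(size=(n, 1))
y = (rng.uniform(size=n) < X[:, 0]).astype(int)

# k = floor(n^(2/3)), as suggested by Theorem 16 when alpha = d = 1.
k = int(np.floor(n ** (2 / 3)))
knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
scores = knn.predict_proba(X)[:, 1]  # estimated P(Y=1 | X)

def f1(y_true, y_pred):
    # F1 score, used here as a stand-in for the paper's metric M.
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return 2 * tp / max(2 * tp + fp + fn, 1)

# Deterministic threshold: maximize the metric over 10^4 uniformly
# spaced values of t in [0, 1], classifying as 1 when score >= t.
grid = np.linspace(0.0, 1.0, 10_000)
best_t = max(grid, key=lambda t: f1(y, (scores >= t).astype(int)))
```

A stochastic threshold (t, p) would extend this by randomizing the prediction with probability p exactly at the threshold, searched over a grid on [0, 1]².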