Optimal Binary Classification Beyond Accuracy
Authors: Shashank Singh, Justin T. Khim
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further illustrate these contributions numerically in the case of k-nearest neighbor classification. We provide two numerical experiments to illustrate our results from Sections 4 and 5. |
| Researcher Affiliation | Collaboration | Shashank Singh Max Planck Institute for Intelligent Systems Tübingen, Germany shashankssingh44@gmail.com Justin Khim Amazon New York, NY jkhim@amazon.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Python implementations and instructions for reproducing each experiment can be found at https://gitlab.tuebingen.mpg.de/shashank/imbalanced-binary-classification-experiments. |
| Open Datasets | No | The experiments use synthetic data generated from specified distributions, not publicly available datasets with concrete access information. For instance: 'Suppose X = [0, 1], over which X is uniformly distributed... we drew n independent samples of (X, Y) according to the above distribution.' |
| Dataset Splits | No | The paper mentions generating 'n independent samples' for training and '1000 more independently generated test samples' for evaluation, but does not specify explicit training/validation/test dataset splits with percentages or counts for a separate validation set. |
| Hardware Specification | No | The paper states that hardware specifications are included in Appendix E ('Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix E.'), but Appendix E is not provided in the given text. |
| Software Dependencies | No | The paper mentions Python's scikit-learn package but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | Using this training data, we selected optimal deterministic and stochastic thresholds t ∈ [0, 1] and (t, p) ∈ [0, 1]² for the kNN classifier by maximizing M(Ĉ) over 10⁴ uniformly spaced values in [0, 1] and [0, 1]², respectively. Since, in this example, α = d = 1, we set k = ⌊n^(2/3)⌋ as suggested by Theorem 16. |
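The experiment-setup row above can be sketched in code. This is a hedged illustration, not the authors' implementation: the data distribution, the use of training data for threshold selection, and the choice of metric M (here F1, one of the performance measures the paper considers) are placeholder assumptions; the paper's actual choices are specified in Sections 4 and 5.

```python
# Sketch of thresholded k-NN with k = floor(n^(2/3)) and grid search for the
# deterministic threshold t. Distribution and metric are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(0.0, 1.0, size=(n, 1))        # X uniform on [0, 1]
eta = X.ravel()                                # placeholder for P(Y = 1 | X)
y = (rng.uniform(size=n) < eta).astype(int)    # labels drawn according to eta

k = int(n ** (2 / 3))                          # k = floor(n^(2/3)), per Theorem 16
knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
scores = knn.predict_proba(X)[:, 1]            # estimated eta on the training data

# Deterministic threshold: maximize M over 10^4 uniformly spaced t in [0, 1].
thresholds = np.linspace(0.0, 1.0, 10_000)
preds = scores[None, :] >= thresholds[:, None]      # (10^4, n) predictions
tp = (preds & (y == 1)).sum(axis=1)
fp = (preds & (y == 0)).sum(axis=1)
fn = (~preds & (y == 1)).sum(axis=1)
f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)       # F1 score at each threshold
best_t = thresholds[np.argmax(f1)]
```

The stochastic threshold (t, p) from the quoted setup would be the two-dimensional analogue: a grid of 10⁴ points over [0, 1]², where the classifier predicts 1 with probability p whenever the score equals t.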