Optimal Binary Classification Beyond Accuracy
Authors: Shashank Singh, Justin T. Khim
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further illustrate these contributions numerically in the case of k-nearest neighbor classification. We provide two numerical experiments to illustrate our results from Sections 4 and 5. |
| Researcher Affiliation | Collaboration | Shashank Singh Max Planck Institute for Intelligent Systems Tübingen, Germany shashankssingh44@gmail.com Justin Khim Amazon New York, NY jkhim@amazon.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Python implementations and instructions for reproducing each experiment can be found at https://gitlab.tuebingen.mpg.de/shashank/imbalanced-binary-classification-experiments. |
| Open Datasets | No | The experiments use synthetic data generated from specified distributions, not publicly available datasets with concrete access information. For instance: 'Suppose X = [0, 1], over which X is uniformly distributed... we drew n independent samples of (X, Y) according to the above distribution.' |
| Dataset Splits | No | The paper mentions generating 'n independent samples' for training and '1000 more independently generated test samples' for evaluation, but does not specify explicit training/validation/test dataset splits with percentages or counts for a separate validation set. |
| Hardware Specification | No | The paper states that hardware specifications are included in Appendix E ('Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Appendix E.'), but Appendix E is not provided in the given text. |
| Software Dependencies | No | The paper mentions Python's scikit-learn package but does not specify version numbers for any software dependencies. |
| Experiment Setup | Yes | Using this training data, we selected optimal deterministic and stochastic thresholds t ∈ [0, 1] and (t, p) ∈ [0, 1]² for the kNN classifier by maximizing M(Ĉ) over 10⁴ uniformly spaced values in [0, 1] and [0, 1]², respectively. Since, in this example, α = d = 1, we set k = ⌊n^(2/3)⌋ as suggested by Theorem 16. |
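The experiment-setup row above can be sketched in code. This is a hedged illustration, not the authors' implementation: the data distribution, the use of training data for threshold selection, and the choice of metric M (here F1, one of the performance measures the paper considers) are placeholder assumptions; the paper's actual choices are specified in Sections 4 and 5.

```python
# Sketch of thresholded k-NN with k = floor(n^(2/3)) and grid search for the
# deterministic threshold t. Distribution and metric are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n = 1000
X = rng.uniform(0.0, 1.0, size=(n, 1))        # X uniform on [0, 1]
eta = X.ravel()                                # placeholder for P(Y = 1 | X)
y = (rng.uniform(size=n) < eta).astype(int)    # labels drawn according to eta

k = int(n ** (2 / 3))                          # k = floor(n^(2/3)), per Theorem 16
knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
scores = knn.predict_proba(X)[:, 1]            # estimated eta on the training data

# Deterministic threshold: maximize M over 10^4 uniformly spaced t in [0, 1].
thresholds = np.linspace(0.0, 1.0, 10_000)
preds = scores[None, :] >= thresholds[:, None]      # (10^4, n) predictions
tp = (preds & (y == 1)).sum(axis=1)
fp = (preds & (y == 0)).sum(axis=1)
fn = (~preds & (y == 1)).sum(axis=1)
f1 = 2 * tp / np.maximum(2 * tp + fp + fn, 1)       # F1 score at each threshold
best_t = thresholds[np.argmax(f1)]
```

The stochastic threshold (t, p) from the quoted setup would be the two-dimensional analogue: a grid of 10⁴ points over [0, 1]², where the classifier predicts 1 with probability p whenever the score equals t.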