When are Non-Parametric Methods Robust?
Authors: Robi Bhattacharjee, Kamalika Chaudhuri
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical results are, by nature, large sample; we next validate how well they apply to the finite sample case by trying them out on a simple example. In particular, we ask the following question: How does the robustness of non-parametric classifiers change with increasing sample size? This question is considered in the context of two simple non-parametric classifiers: one-nearest-neighbor (which is guaranteed to be r-consistent) and histograms (which is not). To be able to measure performance with increasing data size, we look at a simple synthetic dataset: the Half Moons. |
| Researcher Affiliation | Academia | Robi Bhattacharjee * 1 Kamalika Chaudhuri * 1 1Department of Computer Science, University of California, San Diego. Correspondence to: Robi <rcbhatta@eng.ucsd.edu>. |
| Pseudocode | Yes | Algorithm 1 RobustNonPar. Input: S ∼ Dⁿ, weight function W, robustness radius r. S_r ← AdvPrun(S, r). Output: W_{S_r} |
| Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper mentions using the "Halfmoon dataset" but does not provide a specific link, DOI, repository name, or a formal citation with author names and year in brackets or parentheses for accessing it. |
| Dataset Splits | No | The paper discusses training set size and test examples, but does not provide explicit percentages or counts for training, validation, and test splits, nor does it reference predefined splits with formal citations that would enable reproduction. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., specific Python library versions, framework versions, or solver versions) that would be needed for reproducibility. |
| Experiment Setup | Yes | We use the Halfmoon dataset with two settings of the gaussian noise parameter σ, σ = 0 (Noiseless) and σ = 0.08 (Noisy). For the Noiseless setting, observe that the data is already 0.1-separated; for the Noisy setting, we use Adversarial Pruning (Algorithm 1) with parameter r = 0.1 for both classification methods. |
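To make the quoted setup concrete, the pipeline in the paper (Halfmoon data with gaussian noise σ, Adversarial Pruning with r = 0.1, then a non-parametric classifier such as 1-NN) can be sketched as below. This is a hedged approximation, not the authors' code: the paper's Adversarial Pruning computes an optimal maximum subset in which oppositely-labeled points are 2r-separated, whereas `adversarial_prune` here is a simple greedy vertex-cover heuristic introduced for illustration, and `make_moons` with its `noise` parameter stands in for the paper's Halfmoon generator.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier

def adversarial_prune(X, y, r):
    """Greedy stand-in for Adversarial Pruning (Algorithm 1):
    repeatedly drop the point involved in the most cross-label
    conflicts until no two oppositely-labeled points lie within 2r.
    Returns the indices of the kept points."""
    keep = np.ones(len(X), dtype=bool)
    while True:
        idx = np.where(keep)[0]
        # pairwise distances among the currently kept points
        D = np.linalg.norm(X[idx, None] - X[None, idx], axis=-1)
        # conflict: closer than 2r and differently labeled
        conflict = (D < 2 * r) & (y[idx, None] != y[None, idx])
        degrees = conflict.sum(axis=1)
        if degrees.max() == 0:  # kept set is now 2r-separated across labels
            return idx
        keep[idx[degrees.argmax()]] = False

# "Noisy" setting from the quoted setup: sigma = 0.08, r = 0.1
X, y = make_moons(n_samples=500, noise=0.08, random_state=0)
kept = adversarial_prune(X, y, r=0.1)
clf = KNeighborsClassifier(n_neighbors=1).fit(X[kept], y[kept])
```

After pruning, the 1-NN classifier trained on the surviving points is robust at radius r by construction, since every training point has no oppositely-labeled neighbor within 2r.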