Non-Convex SGD Learns Halfspaces with Adversarial Label Noise
Authors: Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the problem of agnostically learning homogeneous halfspaces in the distribution-specific PAC model. For a broad family of structured distributions, including log-concave distributions, we show that non-convex SGD efficiently converges to a solution with misclassification error O(opt) + ϵ, where opt is the misclassification error of the best-fitting halfspace. In sharp contrast, we show that optimizing any convex surrogate inherently leads to misclassification error of ω(opt), even under Gaussian marginals. |
| Researcher Affiliation | Academia | Ilias Diakonikolas, University of Wisconsin-Madison (ilias@cs.wisc.edu); Vasilis Kontonis, University of Wisconsin-Madison (kontonis@wisc.edu); Christos Tzamos, University of Wisconsin-Madison (tzamos@wisc.edu); Nikos Zarifis, University of Wisconsin-Madison (zarifis@wisc.edu) |
| Pseudocode | Yes | Algorithm 1: PSGD for f(w) = E_{z∼D}[g(z, w)] |
| Open Source Code | No | The paper does not provide any statement or link for open-source code. |
| Open Datasets | No | The paper discusses theoretical distributions (e.g., log-concave, Gaussian) and concepts, but does not mention the use of any specific publicly available datasets for empirical training or evaluation, nor does it provide access information for any data. |
| Dataset Splits | No | This is a theoretical paper that does not report on empirical experiments with specific dataset splits (training, validation, test). |
| Hardware Specification | No | This is a theoretical paper and does not describe any specific hardware used for experiments. |
| Software Dependencies | No | This is a theoretical paper and does not list any specific software dependencies with version numbers for experimental reproducibility. |
| Experiment Setup | No | This is a theoretical paper and does not describe any experimental setup details such as hyperparameters or training configurations. |