Cryptographic Hardness of Learning Halfspaces with Massart Noise

Authors: Ilias Diakonikolas, Daniel Kane, Pasin Manurangsi, Lisheng Ren

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study the complexity of PAC learning halfspaces in the presence of Massart noise. In this problem, we are given i.i.d. labeled examples (x, y) ∈ ℝ^N × {±1}, where the distribution of x is arbitrary and the label y is a Massart corruption of f(x), for an unknown halfspace f : ℝ^N → {±1}, with flipping probability η(x) ≤ η < 1/2. The goal of the learner is to compute a hypothesis with small 0-1 error. Our main result is the first computational hardness result for this learning problem. Specifically, assuming the (widely believed) subexponential-time hardness of the Learning with Errors (LWE) problem, we show that no polynomial-time Massart halfspace learner can achieve error better than Ω(η), even if the optimal 0-1 error is small, namely OPT = 2^(−log^c(N)) for any universal constant c ∈ (0, 1). (An illustrative simulation of the Massart noise model appears after this table.)
Researcher Affiliation | Collaboration | Ilias Diakonikolas (UW Madison, ilias@cs.wisc.edu); Daniel M. Kane (UC San Diego, dakane@ucsd.edu); Pasin Manurangsi (Google Research, pasin@google.com); Lisheng Ren (UW Madison, lren29@wisc.edu)
Pseudocode | Yes | Algorithm 1: Rejection Sampling Algorithm; Algorithm 2: Reducing LWE to Learning PTFs with Massart Noise. (A generic rejection-sampling sketch appears after this table.)
Open Source Code | No | The paper does not provide any links to open-source code or explicit statements about code availability for the described methodology. The checklist item 3a is marked N/A.
Open Datasets | No | This is a theoretical paper focusing on computational hardness proofs and theoretical models (PAC learning, LWE), not empirical studies involving datasets or training. No real-world datasets are used or referenced for training.
Dataset Splits | No | This is a theoretical paper. It does not conduct experiments on datasets that would require training, validation, or test splits.
Hardware Specification | No | This is a theoretical paper. No experiments requiring specific hardware were conducted or described. The checklist item 3d is marked N/A.
Software Dependencies | No | This is a theoretical paper. No software implementation details with version numbers are provided or necessary for the theoretical results. The checklist item 3b is marked N/A.
Experiment Setup | No | This is a theoretical paper. No empirical experiments or their setup details, such as hyperparameters or training configurations, are provided. The checklist item 3b is marked N/A.
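To make the learning model in the abstract concrete, the following minimal Python sketch simulates Massart-corrupted labels of a halfspace: each label sign(⟨w, x⟩) is flipped independently with an instance-dependent probability η(x) bounded by η < 1/2. The function name, the Gaussian marginal over x, and the particular choice of η(x) are illustrative assumptions only; the paper's hardness reduction relies on a carefully constructed distribution, not this generator.

```python
import numpy as np

def sample_massart_halfspace(n_samples, dim, w, eta_bound, seed=None):
    """Draw labeled examples (x, y) where y is a Massart corruption of sign(<w, x>).

    Each label is flipped independently with probability eta(x) <= eta_bound < 1/2;
    eta(x) may depend on x, so an arbitrary instance-dependent choice is used here.
    """
    rng = np.random.default_rng(seed)
    # The Massart model allows an arbitrary marginal over x; a standard
    # Gaussian is used here purely for illustration.
    X = rng.standard_normal((n_samples, dim))
    clean = np.sign(X @ w)
    clean[clean == 0] = 1  # break ties so labels lie in {-1, +1}
    # Illustrative instance-dependent flip probability, bounded by eta_bound.
    eta_x = eta_bound * np.abs(np.sin(X[:, 0]))
    flips = rng.random(n_samples) < eta_x
    y = np.where(flips, -clean, clean)
    return X, y

# Example: 1000 examples in dimension 10 with flip probability at most 0.3.
w = np.ones(10)
X, y = sample_massart_halfspace(1000, 10, w, 0.3, seed=0)
```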
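The paper's Algorithm 1 is a rejection sampling routine used inside the reduction from LWE; its exact acceptance rule is not reproduced on this page. As a reminder of the general technique only, here is a minimal generic rejection sampler; the target and proposal densities and the bound M in the example are assumptions chosen for illustration and are unrelated to the paper's construction.

```python
import numpy as np

def rejection_sample(target, proposal_sampler, proposal_pdf, M, n_samples, seed=None):
    """Generic rejection sampling: draw z from the proposal and accept it with
    probability target(z) / (M * proposal_pdf(z)), where M bounds that ratio by 1.
    The target density may be known only up to a normalizing constant."""
    rng = np.random.default_rng(seed)
    accepted = []
    while len(accepted) < n_samples:
        z = proposal_sampler(rng)
        if rng.random() < target(z) / (M * proposal_pdf(z)):
            accepted.append(z)
    return np.array(accepted)

# Example: sample from a density proportional to exp(-z^2 / 2) on [0, 2]
# using a uniform proposal on [0, 2]; the density ratio is maximized at z = 0.
unnorm_target = lambda z: np.exp(-0.5 * z * z)
proposal_pdf = lambda z: 0.5
proposal_sampler = lambda rng: rng.uniform(0.0, 2.0)
M = unnorm_target(0.0) / proposal_pdf(0.0)
samples = rejection_sample(unnorm_target, proposal_sampler, proposal_pdf, M, 1000, seed=0)
```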