Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Cryptographic Hardness of Learning Halfspaces with Massart Noise

Authors: Ilias Diakonikolas, Daniel Kane, Pasin Manurangsi, Lisheng Ren

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We study the complexity of PAC learning halfspaces in the presence of Massart noise. In this problem, we are given i.i.d. labeled examples (x, y) RN { 1}, where the distribution of x is arbitrary and the label y is a Massart corruption of f(x), for an unknown halfspace f : RN { 1}, with flipping probability η(x) η < 1/2. The goal of the learner is to compute a hypothesis with small 0-1 error. Our main result is the first computational hardness result for this learning problem. Specifically, assuming the (widely believed) subexponential-time hardness of the Learning with Errors (LWE) problem, we show that no polynomialtime Massart halfspace learner can achieve error better than Ω(η), even if the optimal 0-1 error is small, namely OPT = 2 logc(N) for any universal constant c (0, 1).
Researcher Affiliation Collaboration Ilias Diakonikolas UW Madison EMAIL Daniel M. Kane UC San-Diego EMAIL Pasin Manurangsi Google Research EMAIL Lisheng Ren UW Madison EMAIL
Pseudocode Yes Algorithm 1 Rejection Sampling Algorithm Algorithm 2 Reducing LWE to Learning PTFs with Massart Noise
Open Source Code No The paper does not provide any links to open-source code or explicit statements about code availability for the described methodology. The checklist item 3a is marked N/A.
Open Datasets No This is a theoretical paper focusing on computational hardness proofs and theoretical models (PAC learning, LWE), not empirical studies involving datasets or training. No real-world datasets are used or referenced for training.
Dataset Splits No This is a theoretical paper. It does not conduct experiments on datasets that would require training, validation, or test splits.
Hardware Specification No This is a theoretical paper. No experiments requiring specific hardware were conducted or described. The checklist item 3d is marked N/A.
Software Dependencies No This is a theoretical paper. No software implementation details with version numbers are provided or necessary for the theoretical results. The checklist item 3b is marked N/A.
Experiment Setup No This is a theoretical paper. No empirical experiments or their setup details, such as hyperparameters or training configurations, are provided. The checklist item 3b is marked N/A.