Hardness of Noise-Free Learning for Two-Hidden-Layer Neural Networks

Authors: Sitan Chen, Aravind Gollakota, Adam Klivans, Raghu Meka

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer ReLU networks with respect to Gaussian inputs in the standard (noise-free) model. No general SQ lower bounds were known for learning ReLU networks of any depth in this setting: previous SQ lower bounds held only for adversarial noise models (agnostic learning) [KK14, GGK20, DKZ20] or restricted models such as correlational SQ [GGJ+20, DKKZ20]. Prior work hinted at the impossibility of our result: Vempala and Wilmes [VW19] showed that general SQ lower bounds cannot apply to any real-valued family of functions that satisfies a simple non-degeneracy condition. To circumvent their result, we refine a lifting procedure due to Daniely and Vardi [DV21] that reduces Boolean PAC learning problems to Gaussian ones. We show how to extend their technique to other learning models and, in many well-studied cases, obtain a more efficient reduction. As such, we also prove new cryptographic hardness results for PAC learning two-hidden-layer ReLU networks, as well as new lower bounds for learning constant-depth ReLU networks from label queries. (...) 2. If you are including theoretical results... (a) Did you state the full set of assumptions of all theoretical results? [Yes] (b) Did you include complete proofs of all theoretical results? [Yes] All proofs are present in either the main body or the forthcoming supplement. 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A] (For a concrete sketch of the function class in question, see the note after this table.)
Researcher Affiliation | Academia | Sitan Chen (UC Berkeley, sitanc@berkeley.edu); Aravind Gollakota (UT Austin, aravindg@cs.utexas.edu); Adam R. Klivans (UT Austin, klivans@cs.utexas.edu); Raghu Meka (UCLA, raghum@cs.ucla.edu)
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. It presents mathematical theorems, lemmas, and proofs.
Open Source Code | No | The ethics checklist answers the question 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?' with [N/A].
Open Datasets | No | The paper is theoretical and does not report experimental results. Therefore, it does not use datasets for training. The ethics checklist states 'N/A' for experimental questions.
Dataset Splits | No | The paper is theoretical and does not report experimental results. Therefore, it does not specify dataset splits. The ethics checklist states 'N/A' for experimental questions.
Hardware Specification | No | The paper is theoretical and does not report experimental results. Therefore, it does not specify hardware used. The ethics checklist states 'N/A' for experimental questions.
Software Dependencies | No | The paper is theoretical and does not report experimental results. Therefore, it does not list software dependencies with version numbers. The ethics checklist states 'N/A' for experimental questions.
Experiment Setup | No | The paper is theoretical and does not report experimental results. Therefore, it does not describe experimental setup details like hyperparameters or training settings. The ethics checklist states 'N/A' for experimental questions.
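
Note on the function class: as context for the abstract quoted above, here is a minimal LaTeX sketch of a two-hidden-layer ReLU network evaluated on Gaussian inputs. The widths $k_1, k_2$ and the weight notation $(a, W, w)$ are illustrative assumptions of this report, not notation taken from the paper.

% Two-hidden-layer ReLU network over Gaussian inputs (illustrative notation).
% sigma is the ReLU activation; k_1 and k_2 are the hidden-layer widths.
\[
  f(x) \;=\; \sum_{i=1}^{k_2} a_i \, \sigma\!\Big( \sum_{j=1}^{k_1} W_{ij} \, \sigma(w_j^\top x) \Big),
  \qquad \sigma(t) = \max(t, 0), \qquad x \sim \mathcal{N}(0, I_d).
\]

In the noise-free model, the learner receives i.i.d. pairs $(x, f(x))$ with no label noise of any kind; the paper's result is that even in this clean setting, SQ algorithms require superpolynomial resources to learn this class under Gaussian inputs.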