Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Distribution-Specific Hardness of Learning Neural Networks

Author: Ohad Shamir

JMLR 2018

Reproducibility Variable | Result | LLM Response

Research Type | Theoretical | Although neural networks are routinely and successfully trained in practice using simple gradient-based methods, most existing theoretical results are negative, showing that learning such networks is difficult, in a worst-case sense over all data distributions. In this paper, we take a more nuanced view and consider whether specific assumptions on the niceness of the input distribution, or niceness of the target function (e.g., in terms of smoothness, non-degeneracy, incoherence, random choice of parameters, etc.), are sufficient to guarantee learnability using gradient-based methods. We provide evidence that neither class of assumptions alone is sufficient: on the one hand, for any member of a class of nice target functions, there are difficult input distributions; on the other hand, we identify a family of simple target functions which are difficult to learn even if the input distribution is nice. To prove our results, we develop some tools which may be of independent interest, such as extending Fourier-based hardness techniques developed in the context of statistical queries (Blum et al., 1994) from the Boolean cube to Euclidean space and to more general classes of functions.

Researcher Affiliation | Academia | Ohad Shamir (EMAIL), Weizmann Institute of Science, Rehovot, Israel

Pseudocode | No | The paper primarily presents mathematical proofs, theorems, and discussion. It does not include any sections explicitly labeled "Pseudocode" or "Algorithm", nor does it contain structured, code-like procedural descriptions.

Open Source Code | No | The paper contains no explicit statements about releasing source code for the described methodology and provides no links to code repositories.

Open Datasets | No | This is a theoretical paper focused on mathematical proofs and analyses of learning hardness. It discusses general input distributions, such as Gaussians and mixtures of Gaussians, in theoretical examples, but it does not describe or use specific named datasets in an experimental context, so no information on public dataset availability is provided.

Dataset Splits | No | As a theoretical paper, it conducts no empirical experiments with specific datasets, so no training, validation, or test splits are reported.

Hardware Specification | No | The paper is theoretical and focuses on mathematical proofs and analyses of learning hardness. It describes no experiments that would require specific hardware, so no hardware specifications are mentioned.

Software Dependencies | No | The paper is purely theoretical, presenting mathematical proofs and analyses. It describes no computational experiments or implementations that would require software dependencies with version numbers.

Experiment Setup | No | As a theoretical paper, it describes no empirical experiments, so no experimental setup details such as hyperparameters or training configurations are provided.