Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Breaking the Curse of Dimensionality with Convex Neural Networks

Authors: Francis Bach

JMLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We provide a detailed theoretical analysis of their generalization performance, with a study of both the approximation and the estimation errors. We show in particular that they are adaptive to unknown underlying linear structures, such as the dependence on the projection of the input variables onto a low-dimensional subspace. Moreover, when using sparsity-inducing norms on the input weights, we show that high-dimensional non-linear variable selection may be achieved, without any strong assumption regarding the data and with a total number of variables potentially exponential in the number of observations. However, solving this convex optimization problem in infinite dimensions is only possible if the non-convex subproblem of addition of a new unit can be solved efficiently. We provide a simple geometric interpretation for our choice of activation functions and describe simple conditions for convex relaxations of the finite-dimensional non-convex subproblem to achieve the same generalization error bounds, even when constant-factor approximations cannot be found. We were not able to find strong enough convex relaxations to obtain provably polynomial-time algorithms and leave open the existence or non-existence of such tractable algorithms with non-exponential sample complexities.
Researcher Affiliation | Academia | Francis Bach EMAIL INRIA Sierra Project-team, Département d'Informatique de l'École Normale Supérieure (UMR CNRS/ENS/INRIA), 2, rue Simone Iff, 75012 Paris, France
Pseudocode | Yes | The conditional gradient algorithm (a.k.a. Frank-Wolfe algorithm) is an iterative algorithm, starting from any function $f_0 \in \mathcal{F}_1^\delta$ and with the following recursion, for $t \geq 0$: $\bar{f}_t \in \arg\min_{f \in \mathcal{F}_1^\delta} \langle f, J'(f_t) \rangle_{L_2(d\rho)}$, $f_{t+1} = (1 - \rho_t) f_t + \rho_t \bar{f}_t$. See an illustration in Figure 1.
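The recursion quoted above is the conditional gradient method in an infinite-dimensional function space. As a hedged, finite-dimensional sketch only (not the paper's setting), the same two steps — a linear minimization oracle followed by a convex combination with step size $\rho_t = 2/(t+2)$ — can be written for a quadratic objective over an $\ell_1$-ball, whose oracle returns a signed vertex; the function name and example problem here are illustrative, not from the paper:

```python
import numpy as np

def frank_wolfe_l1(grad, x0, radius=1.0, n_iter=100):
    """Conditional gradient (Frank-Wolfe) over the l1-ball of the given radius.

    Finite-dimensional analogue of the recursion in the text:
    a linear minimization oracle, then f_{t+1} = (1 - rho_t) f_t + rho_t fbar_t.
    """
    x = x0.astype(float)
    for t in range(n_iter):
        g = grad(x)
        # Linear minimization oracle: argmin_{||s||_1 <= radius} <s, g>
        # is attained at a signed vertex of the l1-ball.
        i = int(np.argmax(np.abs(g)))
        s = np.zeros_like(x)
        s[i] = -radius * np.sign(g[i])
        rho = 2.0 / (t + 2.0)          # standard open-loop step size
        x = (1.0 - rho) * x + rho * s  # convex combination keeps x feasible
    return x

# Illustrative use: minimize ||Ax - b||^2 over the unit l1-ball.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = A @ np.array([0.5, -0.3, 0.0, 0.0, 0.0])
grad = lambda x: 2.0 * A.T @ (A @ x - b)
x = frank_wolfe_l1(grad, np.zeros(5), radius=1.0, n_iter=200)
```

Because every iterate is a convex combination of feasible points, the iterates never leave the constraint set, which is what makes the method attractive when, as in the paper, the feasible set is a ball of functions whose extreme points are single units.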
Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not conduct experiments on specific datasets. Therefore, it does not provide access information for any open datasets.
Dataset Splits | No | The paper is theoretical and does not describe experimental work using datasets, thus it does not provide information about dataset splits.
Hardware Specification | No | The paper is theoretical and focuses on mathematical analysis and proofs; it does not describe any experimental setup or specific hardware used for computations.
Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers for experimental reproducibility.
Experiment Setup | No | The paper is theoretical and does not include an experimental setup, hyperparameters, or training configurations.