Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Non-attracting Regions of Local Minima in Deep and Wide Neural Networks

Authors: Henning Petzka, Cristian Sminchisescu

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We empirically validate the construction of a suboptimal local minimum in a deep and extremely wide neural network as given in the proof of Theorem 2.3. We start by considering a three-layer network of size 2-5-5-1, i.e., we have two input dimensions, one output dimension and hidden layers of five neurons. We use its network function f to create a data set of 20 samples (x_α, f(x_α)), hence we know that a network of size 2-5-5-1 can attain zero loss. We initialize a new neural network of size 2-1-1-1 and train it until convergence to find a local minimum of total loss (sum over the 20 data points) of 4.817. We check for positive definiteness of the matrix B^1_{i,j}(1) (eigenvalues here given by 0.0182, 0.0004) and positivity of B^1_{1,1}(2) (here 0.0757). Following the proof of Theorem 2, we add twenty neurons to both hidden layers to construct a local minimum in an extremely wide network of size 2-21-21-1. The local minimum must be suboptimal by construction of the data set. Experimentally, we show not only that we indeed end up with a suboptimal minimum, but also that it belongs to a non-attracting region of local minima. Figure 3 shows results of this construction.
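The teacher–student construction quoted above can be sketched as follows. The excerpt does not specify the activation function or weight initialization, so the ReLU activation and standard-normal weights below are assumptions for illustration; the key point is that the 20 samples are generated by a 2-5-5-1 teacher network, which therefore attains zero loss on them by construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    """Random weights and biases for a fully connected net with the given layer sizes."""
    return [(rng.standard_normal((m, n)), rng.standard_normal(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x, act=lambda z: np.maximum(z, 0.0)):
    """Forward pass; hidden layers use `act` (ReLU assumed), output layer is linear."""
    for W, b in params[:-1]:
        x = act(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# Teacher of size 2-5-5-1 generates the 20-sample data set (x_α, f(x_α)).
teacher = init_mlp([2, 5, 5, 1], rng)
X = rng.standard_normal((20, 2))   # 20 samples, two input dimensions
Y = forward(teacher, X)            # labels are the teacher's own outputs

# By construction, a 2-5-5-1 network (the teacher itself) attains zero total loss.
total_loss = np.sum((forward(teacher, X) - Y) ** 2)
print(total_loss)  # 0.0
```

A 2-1-1-1 student trained on (X, Y) can then get stuck at a strictly positive loss, as the paper reports (4.817).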
Researcher Affiliation Collaboration Henning Petzka (Lund University); Cristian Sminchisescu (Google Research; Lund University)
Pseudocode No The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. The methods are described using mathematical equations and prose.
Open Source Code Yes The accompanying code can be found at https://github.com/petzkahe/nonattracting_regions_of_local_minima.git.
Open Datasets No We start by considering a three-layer network of size 2-5-5-1, i.e., we have two input dimensions, one output dimension and hidden layers of five neurons. We use its network function f to create a data set of 20 samples (x_α, f(x_α)), hence we know that a network of size 2-5-5-1 can attain zero loss.
Dataset Splits No We use its network function f to create a data set of 20 samples (x_α, f(x_α))... We initialize a new neural network of size 2-1-1-1 and train it until convergence to find a local minimum of total loss (sum over the 20 data points) of 4.817. The paper does not specify any training/test/validation splits for this custom-generated dataset.
Hardware Specification No The paper mentions that an experiment was performed to empirically validate the construction, but it does not specify any hardware details like GPU/CPU models, processors, or memory specifications used for running the experiments.
Software Dependencies No The paper describes an empirical validation but does not provide specific software dependencies (e.g., library names with version numbers) that would be needed to replicate the experiment.
Experiment Setup No We initialize a new neural network of size 2-1-1-1 and train it until convergence to find a local minimum of total loss (sum over the 20 data points) of 4.817. ... Following the proof of Theorem 2, we add twenty neurons to both hidden layers to construct a local minimum in an extremely wide network of size 2-21-21-1. This text specifies network architectures and final loss but lacks specific hyperparameters such as learning rate, optimizer, or number of epochs needed for training a neural network.
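To make concrete what detail is missing, here is a minimal full-batch gradient-descent loop for a 2-1-1-1 student network. The activation (tanh), optimizer, learning rate, and step count are all assumptions chosen for this sketch, not details from the paper; the gradient is computed by central differences, which is adequate for a seven-parameter model.

```python
import numpy as np

rng = np.random.default_rng(1)

# 2-1-1-1 student: 7 parameters flattened into one vector for simple numerics.
# W1 (2x1), b1 (1,), W2 (1x1), b2 (1,), W3 (1x1), b3 (1,)
def unpack(theta):
    W1 = theta[0:2].reshape(2, 1); b1 = theta[2:3]
    W2 = theta[3:4].reshape(1, 1); b2 = theta[4:5]
    W3 = theta[5:6].reshape(1, 1); b3 = theta[6:7]
    return W1, b1, W2, b2, W3, b3

def predict(theta, X):
    W1, b1, W2, b2, W3, b3 = unpack(theta)
    h1 = np.tanh(X @ W1 + b1)   # tanh is an assumption; the paper's choice is unstated
    h2 = np.tanh(h1 @ W2 + b2)
    return h2 @ W3 + b3

def total_loss(theta, X, Y):
    """Total (summed, not averaged) squared loss, as in the quoted passage."""
    return np.sum((predict(theta, X) - Y) ** 2)

def num_grad(f, theta, eps=1e-6):
    """Central-difference gradient; fine for this 7-parameter sketch."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta); e[i] = eps
        g[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return g

# Toy data standing in for the paper's 20 teacher-generated samples.
X = rng.standard_normal((20, 2))
Y = rng.standard_normal((20, 1))

theta = 0.1 * rng.standard_normal(7)
lr, steps = 1e-2, 2000            # assumed hyperparameters, for illustration only
for _ in range(steps):
    theta -= lr * num_grad(lambda t: total_loss(t, X, Y), theta)

print(total_loss(theta, X, Y))    # final total squared loss after 2000 steps
```

A full reproduction would replace the assumed choices above with the authors' actual optimizer, learning rate, and stopping criterion, which the paper does not report.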