On the Quality of the Initial Basin in Overspecified Neural Networks
Authors: Itay Safran, Ohad Shamir
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | However, a theoretical explanation for this remains a major open problem, since training neural networks involves optimizing a highly non-convex objective function, and is known to be computationally hard in the worst case. In this work, we study the geometric structure of the associated non-convex objective function, in the context of ReLU networks and starting from a random initialization of the network parameters. Before continuing, we emphasize that our observations are purely geometric in nature, independent of any particular optimization procedure. |
| Researcher Affiliation | Academia | Itay Safran ITAY.SAFRAN@WEIZMANN.AC.IL Ohad Shamir OHAD.SHAMIR@WEIZMANN.AC.IL Weizmann Institute of Science, Israel |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information for source code. |
| Open Datasets | No | The paper refers to 'data' or 'training data' using generic symbols like S = {(x_t, y_t)}_{t=1}^m, but does not provide any concrete access information, such as specific names of public datasets, links, DOIs, or formal citations. |
| Dataset Splits | No | The paper does not provide specific dataset split information for training, validation, or testing. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used to run experiments, as it is a theoretical paper. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | No | The paper does not contain specific experimental setup details such as hyperparameter values or training configurations. |
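To make concrete what "optimizing a highly non-convex objective from a random initialization" means in the ReLU setting the paper studies, here is a minimal sketch of evaluating the squared loss of a one-hidden-layer ReLU network at randomly initialized parameters. The specific architecture, loss, dimensions, and Gaussian initialization below are illustrative assumptions for exposition, not the paper's exact setup.

```python
import numpy as np

def relu(z):
    # Elementwise rectifier: the source of non-smoothness/non-convexity.
    return np.maximum(z, 0.0)

def squared_loss(W, v, X, y):
    # One-hidden-layer ReLU network f(x) = v^T relu(W x),
    # evaluated on a sample S = {(x_t, y_t)}_{t=1}^m (averaged squared error).
    preds = relu(X @ W.T) @ v
    return 0.5 * np.mean((preds - y) ** 2)

rng = np.random.default_rng(0)
d, k, m = 5, 10, 100            # input dim, hidden width, sample size (hypothetical)
X = rng.normal(size=(m, d))     # synthetic inputs
y = rng.normal(size=m)          # synthetic targets
W = rng.normal(size=(k, d))     # random initialization of hidden-layer weights
v = rng.normal(size=k)          # random initialization of output weights

loss_at_init = squared_loss(W, v, X, y)
print(loss_at_init)
```

The loss surface as a function of (W, v) is non-convex; the paper's geometric analysis concerns the quality of the basin that such a random initialization lands in, independent of any particular optimization procedure.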