Adding One Neuron Can Eliminate All Bad Local Minima
Authors: Shiyu Liang, Ruoyu Sun, Jason D. Lee, R. Srikant
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Under mild assumptions, we prove that after adding one special neuron with a skip connection to the output, or one special neuron per layer, every local minimum is a global minimum. |
| Researcher Affiliation | Academia | Shiyu Liang (Coordinated Science Laboratory, Dept. of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, sliang26@illinois.edu); Ruoyu Sun (Coordinated Science Laboratory, Department of ISE, University of Illinois at Urbana-Champaign, ruoyus@illinois.edu); Jason D. Lee (Marshall School of Business, University of Southern California, jasonlee@marshall.usc.edu); R. Srikant (Coordinated Science Laboratory, Dept. of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, rsrikant@illinois.edu) |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | No statement about releasing source code or a link to a code repository for the methodology described in this paper was found. |
| Open Datasets | No | The paper discusses using a 'dataset D' for binary classification tasks and makes an assumption about its realizability, but it does not specify any named public dataset or provide concrete access information (link, DOI, repository, or formal citation with authors/year) for any dataset used or mentioned. |
| Dataset Splits | No | This paper is theoretical and does not describe empirical experiments, thus no information on training, validation, or test dataset splits is provided. |
| Hardware Specification | No | This paper is theoretical and does not describe empirical experiments, therefore no specific hardware specifications are mentioned. |
| Software Dependencies | No | This paper is theoretical and does not describe empirical experiments, therefore no specific software dependencies with version numbers are mentioned. |
| Experiment Setup | No | This paper is theoretical and focuses on mathematical proofs, and thus does not include details on experimental setup or hyperparameters. |
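
The Research Type row describes the paper's construction only in words: one special neuron is added with a skip connection to the output. The sketch below illustrates what such an augmentation could look like numerically. It assumes details this summary does not state: that the special neuron is an exponential unit `a * exp(w.x + b)` added to the base output, and that a quadratic penalty `(lam / 2) * a**2` is placed on its output weight. The squared hinge loss, the toy data, and every function and variable name are hypothetical choices made for illustration, not the paper's exact setting.

```python
import numpy as np

def augmented_output(f_x, x, a, w, b):
    """Base network output plus one skip-connected neuron (assumed exponential)."""
    return f_x + a * np.exp(x @ w + b)

def regularized_loss(f_x, x, y, a, w, b, lam=0.1):
    """Squared hinge loss on labels in {-1, +1} plus (lam / 2) * a**2.

    The squared hinge is a stand-in loss chosen for this sketch only.
    """
    margin = y * augmented_output(f_x, x, a, w, b)
    data_loss = np.mean(np.maximum(0.0, 1.0 - margin) ** 2)
    return data_loss + 0.5 * lam * a ** 2

# Toy binary classification data (hypothetical).
x = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
f_x = np.array([2.0, -2.0])  # stand-in for the base network's outputs

w = np.zeros(2)
b = 0.0
loss_off = regularized_loss(f_x, x, y, a=0.0, w=w, b=b)  # extra neuron inactive
loss_on = regularized_loss(f_x, x, y, a=0.5, w=w, b=b)   # extra neuron active
```

In this toy case the base outputs already fit the data with margin, so setting `a = 0` gives zero loss, while a nonzero `a` only adds the penalty term. This matches the intuition that the extra neuron should switch off once the data are fit, under the assumptions stated above.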