Adding One Neuron Can Eliminate All Bad Local Minima

Authors: Shiyu Liang, Ruoyu Sun, Jason D. Lee, R. Srikant

NeurIPS 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | Under mild assumptions, we prove that after adding one special neuron with a skip connection to the output, or one special neuron per layer, every local minimum is a global minimum. |
| Researcher Affiliation | Academia | Shiyu Liang, Coordinated Science Laboratory, Dept. of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, sliang26@illinois.edu; Ruoyu Sun, Coordinated Science Laboratory, Department of ISE, University of Illinois at Urbana-Champaign, ruoyus@illinois.edu; Jason D. Lee, Marshall School of Business, University of Southern California, jasonlee@marshall.usc.edu; R. Srikant, Coordinated Science Laboratory, Dept. of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, rsrikant@illinois.edu |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | No | No statement about releasing source code or a link to a code repository for the methodology described in this paper was found. |
| Open Datasets | No | The paper discusses using a 'dataset D' for binary classification tasks and makes an assumption about its realizability, but it does not specify any named public dataset or provide concrete access information (link, DOI, repository, or formal citation with authors/year) for any dataset used or mentioned. |
| Dataset Splits | No | This paper is theoretical and does not describe empirical experiments, thus no information on training, validation, or test dataset splits is provided. |
| Hardware Specification | No | This paper is theoretical and does not describe empirical experiments, therefore no specific hardware specifications are mentioned. |
| Software Dependencies | No | This paper is theoretical and does not describe empirical experiments, therefore no specific software dependencies with version numbers are mentioned. |
| Experiment Setup | No | This paper is theoretical and focuses on mathematical proofs, and thus does not include details on experimental setup or hyperparameters. |
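
The construction quoted under Research Type (one extra neuron with a skip connection to the output) can be illustrated with a small sketch. This page does not give the form of the special neuron or of the loss, so everything below is an assumption made for illustration: the exponential unit `a * exp(w·x + b)`, the quadratic penalty on `a`, the cubic smooth hinge loss, and the toy `base_net` are hypothetical stand-ins, not the authors' code.

```python
import math

def base_net(x, theta):
    """Hypothetical one-hidden-layer ReLU network standing in for the original model."""
    W, v = theta  # W: hidden weight rows, v: output weights
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in W]
    return sum(vj * hj for vj, hj in zip(v, hidden))

def augmented_net(x, theta, a, w, b):
    """Original output plus one extra neuron wired straight to the output.

    Assumption of this sketch: the extra neuron uses an exponential activation.
    """
    pre_activation = sum(wi * xi for wi, xi in zip(w, x)) + b
    return base_net(x, theta) + a * math.exp(pre_activation)

def smooth_hinge(z, p=3):
    """Assumed smooth surrogate loss: max(z + 1, 0) ** p."""
    return max(z + 1.0, 0.0) ** p

def regularized_loss(data, theta, a, w, b, lam=0.1):
    """Classification loss on labels y in {-1, +1} plus an assumed penalty on a."""
    fit = sum(smooth_hinge(-y * augmented_net(x, theta, a, w, b)) for x, y in data)
    return fit + 0.5 * lam * a * a

# Toy data and parameters, chosen only to exercise the functions.
data = [([1.0, 0.0], 1), ([0.0, 1.0], -1)]
theta = ([[1.0, -1.0], [-1.0, 1.0]], [1.0, -1.0])
loss_without_extra_neuron = regularized_loss(data, theta, 0.0, [0.5, 0.5], 0.0)
```

With `a = 0` the extra neuron contributes nothing, so the augmented network reduces to the original one. In this sketch the penalty on `a` is what pushes the extra neuron toward that inactive state.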