The Multilinear Structure of ReLU Networks
Authors: Thomas Laurent, James von Brecht
ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the loss surface of neural networks equipped with a hinge loss criterion and ReLU or leaky ReLU nonlinearities. Any such network defines a piecewise multilinear form in parameter space. By appealing to harmonic analysis we show that all local minima of such networks are non-differentiable, except for those minima that occur in a region of parameter space where the loss surface is perfectly flat. Non-differentiable minima are therefore not technicalities or pathologies; they are at the heart of the problem when investigating the loss of ReLU networks. As a consequence, we must employ techniques from nonsmooth analysis to study these loss surfaces. We show how to apply these techniques in some illustrative cases. |
| Researcher Affiliation | Academia | ¹Department of Mathematics, Loyola Marymount University, Los Angeles, CA 90045, USA. ²Department of Mathematics and Statistics, California State University, Long Beach, Long Beach, CA 90840, USA. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link for open-source code for the methodology. |
| Open Datasets | No | The paper is theoretical and discusses properties of neural networks and data generally (e.g., 'linearly separable data') but does not specify or provide access information for any publicly available datasets used for empirical training. |
| Dataset Splits | No | The paper is theoretical and does not describe training, validation, or test dataset splits, as it does not conduct empirical experiments. |
| Hardware Specification | No | The paper is theoretical and does not describe any specific hardware used for running experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify any software dependencies with version numbers for experimental reproducibility. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup details such as hyperparameters or system-level training settings. |
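Since the paper releases no code, the central claim summarized above can still be checked numerically. The sketch below (a hypothetical illustration, not the authors' implementation) builds a one-hidden-layer ReLU network with hinge loss and evaluates the loss along a line in the first layer's parameters: because the pre-activations are affine in the line parameter, the restricted loss is piecewise affine, consistent with the piecewise multilinear structure the paper describes.

```python
import numpy as np

# Hypothetical toy setup: 20 random samples with labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))           # 20 samples, 5 features
y = rng.choice([-1.0, 1.0], size=20)   # binary labels for the hinge loss

W1 = rng.normal(size=(5, 8))           # input -> hidden weights
w2 = rng.normal(size=8)                # hidden -> output weights

def hinge_loss(W1, w2):
    """Hinge loss of a one-hidden-layer ReLU network."""
    h = np.maximum(X @ W1, 0.0)        # ReLU hidden layer
    scores = h @ w2
    return np.mean(np.maximum(0.0, 1.0 - y * scores))

# Restrict the loss to the line t -> W1 + t * D in the first layer's
# parameters, all other parameters frozen. Between changes of the ReLU
# activation pattern, scores are affine in t, so the loss is piecewise
# affine along this line.
D = rng.normal(size=W1.shape)
losses = [hinge_loss(W1 + t * D, w2) for t in np.linspace(-1.0, 1.0, 9)]
print(losses)
```

Kinks in such one-dimensional slices correspond to the non-differentiable points the paper argues are essential rather than pathological.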