Why Deep Neural Networks for Function Approximation?
Authors: Shiyu Liang, R. Srikant
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We show that, for a large class of piecewise smooth functions, the number of neurons needed by a shallow network to approximate a function is exponentially larger than the corresponding number of neurons needed by a deep network for a given degree of function approximation. First, we consider univariate functions on a bounded interval and require a neural network to achieve an approximation error of ε uniformly over the interval. We show that shallow networks (i.e., networks whose depth does not depend on ε) require Ω(poly(1/ε)) neurons while deep networks (i.e., networks whose depth grows with 1/ε) require O(polylog(1/ε)) neurons. We then extend these results to certain classes of important multivariate functions. Our results are derived for neural networks which use a combination of rectifier linear units (ReLUs) and binary step units, two of the most popular types of activation functions. Our analysis builds on a simple observation: the multiplication of two bits can be represented by a ReLU. (A small illustrative sketch of this observation appears after the table.) |
| Researcher Affiliation | Academia | Shiyu Liang & R. Srikant, Coordinated Science Laboratory and Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not conduct empirical studies on datasets; therefore, no training datasets are mentioned. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical validation on datasets; thus, no dataset split information is provided. |
| Hardware Specification | No | The paper is theoretical and does not involve experiments; therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not involve experiments; therefore, no software dependencies with version numbers are mentioned. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setup or training configurations. |
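As a companion to the abstract quoted above, here is a minimal sketch (our own illustration, not code from the paper) of its closing observation: for bits x, y ∈ {0, 1}, the product x·y equals max(0, x + y − 1), i.e., a single ReLU neuron with unit weights and bias −1 computes bit multiplication.

```python
# Minimal sketch (not from the paper) of the abstract's key observation:
# for bits x, y in {0, 1}, the product x * y equals ReLU(x + y - 1).

def relu(z: float) -> float:
    """Rectified linear unit: max(0, z)."""
    return max(0.0, z)

def bit_product(x: int, y: int) -> float:
    """Multiply two bits with one ReLU neuron (weights 1, 1 and bias -1)."""
    return relu(x + y - 1)

# Exhaustive check over all four bit pairs.
for x in (0, 1):
    for y in (0, 1):
        assert bit_product(x, y) == x * y
print("ReLU(x + y - 1) == x * y for all bits x, y in {0, 1}")
```

The sketch only verifies this pointwise identity; the paper's Ω(poly(1/ε)) and O(polylog(1/ε)) neuron bounds rest on deeper constructions that use such bit-level building blocks.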