Understanding Weight Normalized Deep Neural Networks with Rectified Linear Units

Authors: Yixi Xu, Xiao Wang

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper presents a general framework for norm-based capacity control for L_{p,q} weight normalized deep neural networks. We establish the upper bound on the Rademacher complexities of this family. With an L_{p,q} normalization where q ≤ p* and 1/p + 1/p* = 1, we discuss properties of a width-independent capacity control, which depends on the depth only through a square-root term. We further analyze the approximation properties of L_{p,q} weight normalized deep neural networks. In particular, for an L_{1,∞} weight normalized network, the approximation error can be controlled by the L_1 norm of the output layer, and the corresponding generalization error depends on the architecture only through the square root of the depth. (A small illustrative sketch of the L_{p,q} norm follows the table.)
Researcher Affiliation | Academia | Yixi Xu, Department of Statistics, Purdue University, West Lafayette, IN 47907 (xu573@purdue.edu); Xiao Wang, Department of Statistics, Purdue University, West Lafayette, IN 47907 (wangxiao@purdue.edu)
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The paper does not provide any statement or link regarding open-source code for the described methodology.
Open Datasets | No | The paper does not mention using any datasets for training or evaluation, nor does it provide any concrete access information for a publicly available dataset.
Dataset Splits | No | The paper does not describe any dataset splits for training, validation, or testing, as it focuses on theoretical analysis rather than empirical experimentation.
Hardware Specification | No | The paper does not specify any hardware used for running experiments, as it focuses on theoretical analysis.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers, as it focuses on theoretical analysis rather than implementation details.
Experiment Setup | No | The paper does not provide details about an experimental setup, hyperparameters, or training configurations, as it focuses on theoretical analysis.
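
Since the paper releases no code, the following is a minimal NumPy sketch of the central object in the abstract: the L_{p,q} norm of a layer's weight matrix, taken here as the L_q norm of the vector of per-unit L_p norms, i.e. ||W||_{p,q} = (Σ_i (Σ_j |w_ij|^p)^{q/p})^{1/q} (the row-versus-column convention varies by author). The function names lpq_norm and rescale_to_ball are our own illustrative choices, not anything from the paper.

```python
import numpy as np

def lpq_norm(W: np.ndarray, p: float, q: float) -> float:
    """L_{p,q} norm of a weight matrix: the L_q norm of the
    per-row L_p norms (each row = one unit's incoming weights)."""
    row_norms = np.linalg.norm(W, ord=p, axis=1)
    return float(np.linalg.norm(row_norms, ord=q))  # ord=np.inf yields the max

def rescale_to_ball(W: np.ndarray, p: float, q: float, c: float = 1.0) -> np.ndarray:
    """Rescale W so that ||W||_{p,q} <= c, leaving it unchanged if already inside."""
    n = lpq_norm(W, p, q)
    return W if n <= c else W * (c / n)

# Example: enforce the L_{1,∞} constraint discussed in the abstract,
# i.e. every unit's incoming L_1 norm is at most 1.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
W_hat = rescale_to_ball(W, p=1, q=np.inf)
assert lpq_norm(W_hat, 1, np.inf) <= 1.0 + 1e-12
```

The rescaling step mirrors the usual weight-normalization trick: because ReLU is positively homogeneous, rescaling a layer's weights can be compensated elsewhere in the network, which is why norm constraints of this kind can control capacity without restricting the function class.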