Deep ReLU Networks Preserve Expected Length

Authors: Boris Hanin, Ryan Jeong, David Rolnick

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "These theoretical results are corroborated by our experiments." and "We empirically verify that our theoretical results accurately predict observed behavior for networks at initialization, while previous bounds are loose and fail to capture subtle architecture dependencies."
Researcher Affiliation | Academia | Boris Hanin, Dept. of Operations Research & Financial Engineering, Princeton University, Princeton, NJ 08544, USA, bhanin@princeton.edu; Ryan Jeong, Dept. of Mathematics, University of Pennsylvania, Philadelphia, PA 19104, USA, rsjeong@sas.upenn.edu; David Rolnick, School of Computer Science, McGill University, Montréal, QC H3A 0G4, Canada, drolnick@cs.mcgill.ca
Pseudocode | No | Explanation: The paper provides detailed mathematical proofs and explanations of its theory but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | Explanation: The paper does not contain any statements about releasing code, nor does it provide a link to a code repository for the methodology described.
Open Datasets | No | Explanation: The paper describes the generation of input "line segments" for experiments but does not use a publicly available or open dataset, nor does it provide concrete access information for any dataset used.
Dataset Splits | No | Explanation: The paper does not specify explicit train/validation/test dataset splits. Experiments involve randomly initialized networks and different initializations.
Hardware Specification | No | Explanation: The paper does not provide any specific hardware details such as GPU or CPU models, memory specifications, or cloud computing instance types used for running the experiments.
Software Dependencies | No | Explanation: The paper does not list specific software dependencies with version numbers (e.g., library names with versions) that would be needed to replicate the experiments.
Experiment Setup | Yes | "For all experiments, weights were initialized from i.i.d. normal distributions with variance 2/fan-in and bias variance 0.1." and "Length distortion is calculated for 500 different initializations of the weights and biases of the network (the weight variance is 2/fan-in)." and "We first approximate the intersections of the line segment with linear region boundaries using a binary search subroutine. Specifically, initialize the set S = [0, 0.5, 1], which will contain the parameter values for these approximations."
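
The experiment setup quoted above can be illustrated with a minimal NumPy sketch. It is not the authors' code (no code was released), and several details are assumptions made only for illustration: the function names (init_network, forward, breakpoints, length_distortion, mean_distortion), the linear output layer, the tolerance tol, the choice of input segment, and the exact refinement rule. The quoted description stops after initializing S = [0, 0.5, 1]; here the set is refined by bisecting any interval whose endpoints have different ReLU activation patterns, relying on the fact that two points with the same full activation pattern lie in one linear region.

```python
import numpy as np

def init_network(widths, rng, bias_var=0.1):
    """widths = [n_in, n_1, ..., n_out]; weights ~ N(0, 2/fan_in), biases ~ N(0, bias_var)."""
    params = []
    for fan_in, fan_out in zip(widths[:-1], widths[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))
        b = rng.normal(0.0, np.sqrt(bias_var), size=fan_out)
        params.append((W, b))
    return params

def forward(params, x):
    """Return the output and the on/off pattern of all hidden ReLUs.
    Assumption: ReLU on hidden layers, linear output layer."""
    pattern, h = [], x
    for W, b in params[:-1]:
        pre = W @ h + b
        pattern.append(pre > 0)
        h = np.maximum(pre, 0.0)
    W, b = params[-1]
    pat = np.concatenate(pattern) if pattern else np.zeros(0, dtype=bool)
    return W @ h + b, pat

def breakpoints(params, x0, x1, tol=1e-6):
    """Parameter values t in [0, 1] approximating where the segment from x0 to x1
    crosses linear-region boundaries. Starts from S = [0, 0.5, 1] (as in the text)
    and bisects intervals whose endpoint patterns differ (assumed refinement rule)."""
    point = lambda t: (1.0 - t) * x0 + t * x1
    pat = {t: forward(params, point(t))[1] for t in (0.0, 0.5, 1.0)}
    stack = [(0.0, 0.5), (0.5, 1.0)]
    while stack:
        a, b = stack.pop()
        if b - a <= tol or np.array_equal(pat[a], pat[b]):
            continue  # interval is tiny, or both ends share one linear region
        m = 0.5 * (a + b)
        pat[m] = forward(params, point(m))[1]
        stack += [(a, m), (m, b)]
    return sorted(pat)

def length_distortion(params, x0, x1, tol=1e-6):
    """Length of the image of the segment divided by the length of the segment."""
    ts = breakpoints(params, x0, x1, tol)
    outs = np.array([forward(params, (1.0 - t) * x0 + t * x1)[0] for t in ts])
    out_len = np.linalg.norm(np.diff(outs, axis=0), axis=1).sum()
    return out_len / np.linalg.norm(x1 - x0)

def mean_distortion(widths, n_init=500, seed=0):
    """Average distortion over n_init random initializations (500 in the text).
    The unit segment along the first input axis is an illustrative choice."""
    rng = np.random.default_rng(seed)
    x0, x1 = np.zeros(widths[0]), np.eye(widths[0])[0]
    vals = [length_distortion(init_network(widths, rng), x0, x1) for _ in range(n_init)]
    return float(np.mean(vals))
```

For example, mean_distortion([2, 16, 16, 2], n_init=500) would average the distortion of one fixed segment over 500 random initializations of a small network; the widths here are arbitrary. Because the network is piecewise linear, summing output chord lengths between consecutive breakpoints gives the image length up to the error introduced by tol.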