Large-width functional asymptotics for deep Gaussian neural networks

Authors: Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we consider fully-connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions. Extending previous results (Matthews et al., 2018a;b; Yang, 2019) we adopt a function-space perspective, i.e. we look at neural networks as infinite-dimensional random elements on the input space R^I. Under suitable assumptions on the activation function we show that: i) a network defines a continuous stochastic process on the input space R^I; ii) a network with re-scaled weights converges weakly to a continuous Gaussian Process in the large-width limit; iii) the limiting Gaussian Process has almost surely locally γ-Hölder continuous paths, for 0 < γ < 1. Our results contribute to recent theoretical studies on the interplay between infinitely-wide deep neural networks and Gaussian Processes by establishing weak convergence in function-space with respect to a stronger metric.
Researcher Affiliation | Collaboration | Daniele Bracale (1), Stefano Favaro (1,2), Sandra Fortini (3), Stefano Peluchetti (4); affiliations: 1 University of Torino, 2 Collegio Carlo Alberto, 3 Bocconi University, 4 Cogent Labs.
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It focuses on mathematical proofs and theoretical derivations.
Open Source Code | No | The paper does not mention or provide access to any open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not involve training on datasets. No information about publicly available datasets or their access is provided.
Dataset Splits | No | The paper is theoretical and does not involve experimental validation with dataset splits, so no information about training, validation, or test splits is provided.
Hardware Specification | No | The paper is theoretical and does not involve running experiments, so no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not involve practical implementation or experimentation, so no software dependencies with version numbers are listed.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or training configurations.
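The paper is purely theoretical, so none of the reproducibility items above apply directly. For readers who want to see the headline claim numerically, the short Python sketch below (not from the paper; the widths, depth, inputs, sample count, and tanh activation are illustrative assumptions) samples deep fully-connected networks with i.i.d. Gaussian weights rescaled by 1/sqrt(width) and prints the empirical covariance of the outputs at a few fixed inputs. As the width grows the covariance stabilizes, which is the finite-dimensional face of the weak convergence to a Gaussian process established in the paper.

# Minimal sketch (not from the paper): empirically illustrating the large-width
# Gaussian-process limit of a deep, fully-connected network with i.i.d. Gaussian
# weights rescaled by 1/sqrt(width). All settings below are illustrative; the
# paper itself reports no experiments.
import numpy as np

rng = np.random.default_rng(0)

def deep_net_outputs(x, width, depth, n_samples, sigma_w=1.0, sigma_b=0.1):
    """Sample `n_samples` independent networks and return their scalar outputs
    at the input points `x` (shape: n_inputs x input_dim)."""
    outs = np.empty((n_samples, x.shape[0]))
    for s in range(n_samples):
        h = x
        for _ in range(depth):
            W = rng.normal(0.0, sigma_w / np.sqrt(h.shape[1]), size=(h.shape[1], width))
            b = rng.normal(0.0, sigma_b, size=width)
            h = np.tanh(h @ W + b)          # hidden layers with tanh activation
        W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, 1))
        b = rng.normal(0.0, sigma_b)
        outs[s] = (h @ W + b).ravel()       # scalar readout layer
    return outs

x = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])   # three inputs in R^2 (so I = 2)
for width in (8, 64, 512):
    samples = deep_net_outputs(x, width=width, depth=3, n_samples=2000)
    cov = np.cov(samples, rowvar=False)
    print(f"width={width:4d}  empirical output covariance:\n{np.round(cov, 3)}")
# As the width grows, the covariance matrix stabilizes and the joint output
# distribution becomes approximately Gaussian, consistent with the paper's
# large-width Gaussian-process limit.

Increasing n_samples tightens the Monte Carlo estimate; the covariance matrices for widths 64 and 512 should be close to one another, while marginal histograms of the sampled outputs look increasingly Gaussian. This only probes finite-dimensional distributions, whereas the paper's contribution is the stronger function-space (weak) convergence and the Hölder continuity of the limiting paths.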