Large-width functional asymptotics for deep Gaussian neural networks
Authors: Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we consider fully-connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions. Extending previous results (Matthews et al., 2018a;b; Yang, 2019), we adopt a function-space perspective, i.e. we look at neural networks as infinite-dimensional random elements on the input space ℝ^I. Under suitable assumptions on the activation function we show that: i) a network defines a continuous stochastic process on the input space ℝ^I; ii) a network with re-scaled weights converges weakly to a continuous Gaussian Process in the large-width limit; iii) the limiting Gaussian Process has almost surely locally γ-Hölder continuous paths, for 0 < γ < 1. Our results contribute to recent theoretical studies on the interplay between infinitely-wide deep neural networks and Gaussian Processes by establishing weak convergence in function-space with respect to a stronger metric. (A minimal numerical sketch of this large-width convergence is given after the table.) |
| Researcher Affiliation | Collaboration | Daniele Bracale¹, Stefano Favaro¹,², Sandra Fortini³, Stefano Peluchetti⁴; ¹University of Torino, ²Collegio Carlo Alberto, ³Bocconi University, ⁴Cogent Labs |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It focuses on mathematical proofs and theoretical derivations. |
| Open Source Code | No | The paper does not mention or provide access to any open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not involve training on datasets. No information about publicly available datasets or their access is provided. |
| Dataset Splits | No | The paper is theoretical and does not involve experimental validation with dataset splits. Therefore, no information about training, validation, or test splits is provided. |
| Hardware Specification | No | The paper is theoretical and does not involve running experiments, so no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not involve practical implementation or experimentation, so no software dependencies with version numbers are listed. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup, hyperparameters, or training configurations. |
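
Although the assessment above finds no code or experiments in the paper (it is purely theoretical), the limit it summarizes is easy to probe numerically. The sketch below is not from the paper: it samples fully-connected networks with i.i.d. Gaussian weights under a 1/√n re-scaling and checks that, at a fixed pair of inputs, the empirical output covariance stabilizes as the width grows, as the finite-dimensional marginals of a limiting Gaussian Process would predict. All concrete choices (tanh activation, depth 3, the input points, unit weight and bias variances) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def network_draw(x, width, depth, sigma_w=1.0, sigma_b=1.0):
    """One sample of a fully-connected tanh network with i.i.d. Gaussian
    weights re-scaled by 1/sqrt(fan_in), evaluated at the rows of x."""
    h = x
    for _ in range(depth):
        W = rng.normal(0.0, sigma_w / np.sqrt(h.shape[1]), size=(h.shape[1], width))
        b = rng.normal(0.0, sigma_b, size=width)
        h = np.tanh(h @ W + b)
    # Linear readout to a scalar output.
    W = rng.normal(0.0, sigma_w / np.sqrt(h.shape[1]), size=(h.shape[1], 1))
    b = rng.normal(0.0, sigma_b, size=1)
    return (h @ W + b).ravel()

# Two inputs in R^1 (I = 1); widths, depth, and variances are hypothetical.
x = np.array([[0.2], [-0.5]])
for width in (8, 64, 512):
    draws = np.stack([network_draw(x, width, depth=3) for _ in range(2000)])
    # As width grows, the 2x2 empirical covariance should stabilize,
    # consistent with Gaussian finite-dimensional marginals in the limit.
    print(f"width={width:4d}  empirical covariance:\n{np.cov(draws, rowvar=False)}")
```

Note that a simulation of this kind only illustrates convergence of finite-dimensional distributions; the paper's actual contribution is the stronger statement of weak convergence in function space with respect to a stronger metric, together with the γ-Hölder continuity of the limiting paths, which cannot be verified by sampling at finitely many inputs.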