Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models

Authors: Tianxiang Gao, Xiaokai Huo, Hailiang Liu, Hongyang Gao

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we present a series of numerical experiments to validate the theoretical results we have established. Our experiments aim to verify the well-posedness of the fixed point of the transition equation (2). We also investigate whether the DEQ behaves as a Gaussian process when the width is sufficiently large, as stated in our main result, Theorem 4.4. Additionally, we examine the strictly positive definiteness of the limiting covariance function Σ, as established in Theorem 4.5, by computing the smallest eigenvalue of the associated covariance matrix K. These experiments serve to empirically support our theoretical findings."
Researcher Affiliation | Academia | Tianxiang Gao (Iowa State University, gaotx@iastate.edu); Xiaokai Huo (Iowa State University, xhuo@iastate.edu); Hailiang Liu (Iowa State University, hliu@iastate.edu); Hongyang Gao (Iowa State University, hygao@iastate.edu)
Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code is made available at https://github.com/deqg/deq.git.
Open Datasets | Yes | "Test accuracy of the MNIST dataset using NNGP and DEQs with various widths"; "MSE of the MNIST dataset using NNGP and DEQs with various widths."
Dataset Splits | No | The paper mentions the use of the MNIST dataset and evaluation on a test set, but it does not specify training/validation/test splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper links to a GitHub repository for the code but does not list specific software dependencies with version numbers.
Experiment Setup | Yes | "To demonstrate this, we consider a specific DEQ with nin = 10 and nout = 10, activated by tanh. We analyze the output distributions of 10,000 neural networks. [...] In the third plot of Figure 1, we present the histogram for a width of 1000. [...] Furthermore, in Figure 5 in the supplementary material, we provide histograms for widths of 10, 50, 100, 500, and 1000."
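The experiment described above can be approximated in a few lines of NumPy: sample many random DEQs, solve each one's fixed-point equation at a handful of fixed inputs, and inspect the empirical covariance matrix K of the outputs, whose smallest eigenvalue probes the strict positive definiteness claimed in Theorem 4.5. This is a minimal sketch, not the paper's code: the transition map z = tanh(Wz + Ux), the weight scales sigma_w and sigma_u, and the function names are all assumptions standing in for the paper's exact transition equation (2); sigma_w is kept small so the map is a contraction, a stand-in for the paper's well-posedness condition.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_network(width, n_in, sigma_w=0.25, sigma_u=1.0):
    """Sample random DEQ weights (assumed parameterization, not the paper's).

    sigma_w is small so that z -> tanh(W z + U x) is a contraction,
    mimicking the well-posedness condition for the fixed point.
    """
    W = rng.normal(0.0, sigma_w / np.sqrt(width), (width, width))
    U = rng.normal(0.0, sigma_u / np.sqrt(n_in), (width, n_in))
    v = rng.normal(0.0, 1.0 / np.sqrt(width), width)  # scalar readout
    return W, U, v

def deq_output(net, x, tol=1e-8, max_iter=500):
    """Iterate z <- tanh(W z + U x) to its fixed point, then read out v^T z."""
    W, U, v = net
    z = np.zeros(W.shape[0])
    for _ in range(max_iter):
        z_new = np.tanh(W @ z + U @ x)
        if np.linalg.norm(z_new - z) < tol:
            break
        z = z_new
    return float(v @ z_new)

n_in, width, n_nets = 10, 100, 1000
xs = [rng.normal(size=n_in) for _ in range(3)]  # three fixed test inputs

# One row per random network: its outputs at the three inputs.  As the
# width grows, these rows should look like draws from N(0, K) -- the
# Gaussian-process behavior the paper's histograms illustrate.
rows = []
for _ in range(n_nets):
    net = sample_network(width, n_in)
    rows.append([deq_output(net, x) for x in xs])
samples = np.array(rows)

K = np.cov(samples, rowvar=False)          # empirical 3x3 covariance matrix
lam_min = float(np.linalg.eigvalsh(K)[0])  # smallest eigenvalue of K
print(f"smallest eigenvalue of empirical K: {lam_min:.4f}")
```

A strictly positive smallest eigenvalue is consistent with (though of course does not prove) the strict positive definiteness of the limiting covariance; the paper's own experiments use far larger widths (up to 1000) and 10,000 networks.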