Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models
Authors: Tianxiang Gao, Xiaokai Huo, Hailiang Liu, Hongyang Gao
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present a series of numerical experiments to validate the theoretical results we have established. Our experiments aim to verify the well-posedness of the fixed point of the transition equation (2). We also investigate whether the DEQ behaves as a Gaussian process when the width is sufficiently large, as stated in our main result, Theorem 4.4. Additionally, we examine the strict positive definiteness of the limiting covariance function Σ, as established in Theorem 4.5, by computing the smallest eigenvalue of the associated covariance matrix K. These experiments serve to empirically support our theoretical findings. |
| Researcher Affiliation | Academia | Tianxiang Gao, Iowa State University, gaotx@iastate.edu; Xiaokai Huo, Iowa State University, xhuo@iastate.edu; Hailiang Liu, Iowa State University, hliu@iastate.edu; Hongyang Gao, Iowa State University, hygao@iastate.edu |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/deqg/deq.git. |
| Open Datasets | Yes | Test accuracy of the MNIST dataset using NNGP and DEQs with various widths; MSE of the MNIST dataset using NNGP and DEQs with various widths. |
| Dataset Splits | No | The paper mentions the use of the MNIST dataset and evaluation on a test set, but it does not specify the training, validation, and test splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper provides a link to a GitHub repository for the code but does not list specific software dependencies with their version numbers within the text. |
| Experiment Setup | Yes | To demonstrate this, we consider a specific DEQ with n_in = 10 and n_out = 10, activated by tanh. We analyze the output distributions of 10,000 neural networks. [...] In the third plot of Figure 1, we present the histogram for a width of 1000. [...] Furthermore, in Figure 5 in the supplementary material, we provide histograms for widths of 10, 50, 100, 500, and 1000. |
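
The experiment setup in the table (many randomly initialized tanh DEQs evaluated on one input, with the outputs collected into a histogram per width) can be illustrated with a minimal NumPy sketch. This is not the authors' code from the linked repository. The parameterization `h = tanh(W h + U x)` with a linear readout, the variance scales `sigma_w`, `sigma_u`, `sigma_v`, the fixed unit-norm input, the plain fixed-point iteration, and the use of a single output coordinate instead of all 10 are assumptions made for illustration. `sigma_w = 0.4` is chosen so that the iteration is typically a contraction.

```python
import numpy as np


def sample_deq_output(x, width, rng, sigma_w=0.4, sigma_u=1.0, sigma_v=1.0,
                      max_iter=200, tol=1e-8):
    """Draw one random tanh DEQ and return one scalar output for input x.

    Assumed (hypothetical) model: h* solves h = tanh(W h + U x), output = v . h*.
    """
    n_in = x.shape[0]
    # Gaussian weights with variance scaled by fan-in (an assumed convention).
    W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
    U = rng.normal(0.0, sigma_u / np.sqrt(n_in), size=(width, n_in))
    v = rng.normal(0.0, sigma_v / np.sqrt(width), size=width)

    injection = U @ x
    h = np.zeros(width)
    # Plain fixed-point iteration; stops once successive iterates agree.
    for _ in range(max_iter):
        h_next = np.tanh(W @ h + injection)
        converged = np.linalg.norm(h_next - h) <= tol * max(1.0, np.linalg.norm(h_next))
        h = h_next
        if converged:
            break
    return float(v @ h)


def sample_outputs(width, n_networks, n_in=10, seed=0):
    """Outputs of n_networks independent random DEQs on one fixed input."""
    rng = np.random.default_rng(seed)
    x = np.ones(n_in) / np.sqrt(n_in)  # fixed unit-norm input (assumption)
    return np.array([sample_deq_output(x, width, rng) for _ in range(n_networks)])


def excess_kurtosis(samples):
    """Excess kurtosis of the samples; it is zero for an exact Gaussian."""
    z = (samples - samples.mean()) / samples.std()
    return float(np.mean(z ** 4) - 3.0)


if __name__ == "__main__":
    # Fewer networks than the 10,000 in the table, to keep the run short.
    for width in (10, 50, 100):
        outputs = sample_outputs(width, n_networks=500)
        counts, _ = np.histogram(outputs, bins=30)  # histogram data per width
        print(width, outputs.mean(), outputs.var(), excess_kurtosis(outputs))
```

Comparing the histogram counts or the excess kurtosis across widths gives a rough check of how Gaussian the outputs look. Reproducing the figures in the paper would require the authors' own parameterization and settings from their repository.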