Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Analysis of one-hidden-layer neural networks via the resolvent method
Authors: Vanessa Piccolo, Dominik Schröder
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove that the Stieltjes transform of the limiting spectral distribution approximately satisfies a quartic self-consistent equation, which is exactly the equation obtained by Pennington and Worah [22] and Benigni and Péché [6] with the moment method. We extend the previous results to the case of additive bias Y = f(WX + B) with B being an independent rank-one Gaussian random matrix, closer modelling the neural network infrastructures encountered in practice. Our key finding is that in the case of additive bias it is impossible to choose an activation function preserving the layer-to-layer singular value distribution, in sharp contrast to the bias-free case where a simple integral constraint is sufficient to achieve isospectrality. The numerical experiments were conducted for the parameters n1 = 3000, ϕ = σx = σw = 1, ψ = 5 (left) or ψ = 2 (right), and σb = 0 (top) or σb = 0.25 (bottom). In Fig. 2 we test this result experimentally and choose the activation function f(x) = c1|x| − c2 with c1, c2 such that (2) is satisfied and θ1(f) = 1. We find that in the bias-free case (left), irrespective of the network depth, the eigenvalues of the covariance matrix Y^(l)(Y^(l))^∗ converge to their theoretical limit from Theorem 2.1, exactly as in [22, Fig. 1]. |
| Researcher Affiliation | Academia | Vanessa Piccolo, ETH Zurich (current affiliation: ENS Lyon); Dominik Schröder, Institute for Theoretical Studies, ETH Zurich |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper specifies the use of 'random data matrix X' and 'random weight matrix W' with 'i.i.d. random variables' but does not mention or provide access information for any publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) needed to reproduce data partitioning. |
| Hardware Specification | No | The paper mentions that 'numerical experiments were conducted' but provides no specific hardware details (e.g., GPU/CPU models, processor types, or memory amounts) used for running these experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiments. |
| Experiment Setup | Yes | The numerical experiments were conducted for the parameters n1 = 3000, ϕ = σx = σw = 1, ψ = 5 (left) or ψ = 2 (right), and σb = 0 (top) or σb = 0.25 (bottom). |
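
The setup quoted in the table can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' code (the paper releases none). This report does not define the conventions used here; they follow the usual Pennington-Worah setup. Assumed: X is n0 × m with i.i.d. N(0, σx²) entries, W is n1 × n0 with i.i.d. N(0, σw²/n0) entries, ϕ = n0/m, ψ = n0/n1, the bias B repeats one Gaussian vector with N(0, σb²) entries across all columns, and the matrix of interest is Y Yᵀ/m. The activation is read as f(x) = c1|x| − c2, with the constants fixed by a zero Gaussian mean and by interpreting θ1(f) = 1 as a unit Gaussian second moment for σw σx = 1. The names `abs_activation` and `sample_spectrum` are hypothetical.

```python
import numpy as np


def abs_activation(x):
    """Assumed reading of the activation: f(x) = c1*|x| - c2.

    For g ~ N(0, 1) and sigma_w * sigma_x = 1 the constants are chosen so that
    E[f(g)] = 0 and E[f(g)^2] = 1 (the latter is our reading of theta_1(f) = 1).
    """
    c1 = 1.0 / np.sqrt(1.0 - 2.0 / np.pi)  # E[f(g)^2] = c1^2 * (1 - 2/pi) = 1
    c2 = c1 * np.sqrt(2.0 / np.pi)         # E[f(g)] = c1 * sqrt(2/pi) - c2 = 0
    return c1 * np.abs(x) - c2


def sample_spectrum(n1, phi, psi, sigma_x=1.0, sigma_w=1.0, sigma_b=0.0,
                    f=abs_activation, seed=0):
    """Eigenvalues (ascending) of Y Y^T / m for Y = f(W X + B).

    The dimension and scaling conventions are assumptions, see the text above.
    """
    rng = np.random.default_rng(seed)
    n0 = int(round(psi * n1))  # assumed: psi = n0 / n1
    m = int(round(n0 / phi))   # assumed: phi = n0 / m
    X = sigma_x * rng.standard_normal((n0, m))
    W = (sigma_w / np.sqrt(n0)) * rng.standard_normal((n1, n0))
    # Rank-one bias: a single random column, broadcast across all m columns.
    b = sigma_b * rng.standard_normal((n1, 1))
    Y = f(W @ X + b)
    return np.linalg.eigvalsh(Y @ Y.T / m)


if __name__ == "__main__":
    # Reduced size for a quick run; the reported experiments use n1 = 3000.
    for sigma_b in (0.0, 0.25):
        eigs = sample_spectrum(n1=300, phi=1.0, psi=2.0, sigma_b=sigma_b)
        print(f"sigma_b={sigma_b}: min={eigs[0]:.3f}, max={eigs[-1]:.3f}")
```

A histogram of the returned eigenvalues for σb = 0 and σb = 0.25 gives an empirical picture to set against the theoretical limit discussed in the paper. With n1 = 3000 and ψ = 5, the assumed conventions imply a 15000 × 15000 data matrix, so a full-size run needs a few gigabytes of memory.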