Nonlinear random matrix theory for deep learning
Authors: Jeffrey Pennington, Pratik Worah
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We apply these results to the computation of the asymptotic performance of single-layer random feature networks on a memorization task and to the analysis of the eigenvalues of the data covariance matrix as it propagates through a neural network. As a byproduct of our analysis, we identify an intriguing new class of activation functions with favorable properties." |
| Researcher Affiliation | Industry | "Jeffrey Pennington, Google Brain, jpennin@google.com; Pratik Worah, Google Research, pworah@google.com" |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., repository links or explicit statements) for open-source code related to the described methodology. |
| Open Datasets | No | The paper states that for the memorization task, 'we take the data X and the targets Y to be independent Gaussian random matrices' and uses 'random input-output pairs', but does not refer to a specific, publicly available named dataset with access information. |
| Dataset Splits | No | The paper discusses 'numerical simulations' and comparing theoretical predictions to these simulations, but it does not specify any training, validation, or test dataset splits. The 'data' appears to be generated random matrices. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | No | The paper describes theoretical derivations and numerical comparisons, but it does not provide specific experimental setup details such as hyperparameters, optimizer settings, or training configurations for a deep learning model. |
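Although the paper reports no concrete experimental configuration, the memorization task it describes is straightforward to sketch: data X and targets Y are independent Gaussian random matrices, passed through a single-layer random feature map, with the readout weights fit by ridge regression. The snippet below is an illustrative reconstruction, not the authors' code; the dimensions, the tanh nonlinearity, and the regularization strength are all assumptions chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not taken from the paper):
# input dim n0, feature dim n1, number of samples m.
n0, n1, m = 100, 200, 50

# Per the paper's memorization task, the data X and targets Y
# are independent Gaussian random matrices.
X = rng.standard_normal((n0, m)) / np.sqrt(n0)
Y = rng.standard_normal((1, m))

# Single-layer random feature map: fixed random weights W and a
# pointwise nonlinearity f (tanh is an arbitrary illustrative choice).
W = rng.standard_normal((n1, n0))
F = np.tanh(W @ X)

# Ridge regression on the random features (gamma is illustrative).
gamma = 1e-3
K = F @ F.T / m
beta = np.linalg.solve(K + gamma * np.eye(n1), F @ Y.T / m)

# Training (memorization) error on the random input-output pairs.
E_train = np.mean((Y - beta.T @ F) ** 2)
```

With more features than samples (n1 > m) and small regularization, the network memorizes the random targets almost perfectly; the paper's analysis characterizes exactly this training error asymptotically via the spectrum of the feature Gram matrix.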