Nonlinear random matrix theory for deep learning

Authors: Jeffrey Pennington, Pratik Worah

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | From the paper: "We apply these results to the computation of the asymptotic performance of single-layer random feature networks on a memorization task and to the analysis of the eigenvalues of the data covariance matrix as it propagates through a neural network. As a byproduct of our analysis, we identify an intriguing new class of activation functions with favorable properties." (A numerical sketch of the corresponding feature Gram-matrix spectrum follows the table.)
Researcher Affiliation | Industry | Jeffrey Pennington (Google Brain, jpennin@google.com); Pratik Worah (Google Research, pworah@google.com).
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., repository links or explicit release statements) for open-source code related to the described methodology.
Open Datasets | No | The paper states that for the memorization task, 'we take the data X and the targets Y to be independent Gaussian random matrices' and uses 'random input-output pairs', but it does not refer to a specific, publicly available named dataset with access information. (A synthetic-data sketch of this setup follows the table.)
Dataset Splits | No | The paper discusses 'numerical simulations' and compares theoretical predictions to them, but it does not specify any training, validation, or test dataset splits; the 'data' appears to be randomly generated matrices.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup | No | The paper describes theoretical derivations and numerical comparisons, but it does not provide specific experimental setup details such as hyperparameters, optimizer settings, or training configurations for a deep learning model.
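
The Research Type row above points to the paper's central object: the spectrum of the Gram matrix of single-layer random features Y = f(WX) built from Gaussian weights and Gaussian data. The sketch below is a minimal numerical illustration of that object, not the authors' code (none is released); the matrix sizes, the tanh activation, and the 1/sqrt(n0) weight scaling are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not released code): eigenvalue spectrum of the
# Gram matrix of single-layer random features Y = f(W X), with Gaussian W and X.
import numpy as np

rng = np.random.default_rng(0)

n0, n1, m = 1000, 1000, 1500       # input dim, layer width, number of samples (illustrative)
f = np.tanh                        # any pointwise activation; tanh is an arbitrary choice

X = rng.standard_normal((n0, m))                  # Gaussian data
W = rng.standard_normal((n1, n0)) / np.sqrt(n0)   # Gaussian weights, 1/sqrt(n0) scaling assumed
Y = f(W @ X)                                      # post-activation features, shape (n1, m)

M = (Y @ Y.T) / m                  # empirical covariance of the features
eigs = np.linalg.eigvalsh(M)

density, edges = np.histogram(eigs, bins=60, density=True)
print("eigenvalue range: [%.3f, %.3f]" % (eigs.min(), eigs.max()))
```

Comparing histograms like this against the paper's analytic predictions is the kind of 'numerical simulation' referenced in the Dataset Splits row.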
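
The Open Datasets and Dataset Splits rows note that the memorization experiment uses independent Gaussian random matrices for the data X and the targets Y rather than a named dataset. The sketch below generates such random input-output pairs and fits a ridge-regression readout on single-layer random features; the dimensions, the tanh activation, and the regularization strength gamma are illustrative assumptions, and the paper's exact loss and normalization may differ.

```python
# Minimal sketch of a single-layer random-feature memorization task on
# synthetic Gaussian data (all specifics below are assumptions, not paper values).
import numpy as np

rng = np.random.default_rng(1)

n0, n1, m = 500, 800, 1000         # input dim, number of random features, number of samples
gamma = 1e-3                       # ridge regularization strength (arbitrary)
f = np.tanh

X = rng.standard_normal((n0, m))                  # random Gaussian data
Y = rng.standard_normal((1, m))                   # random Gaussian targets: nothing to generalize to

W = rng.standard_normal((n1, n0)) / np.sqrt(n0)
F = f(W @ X)                                      # random features, shape (n1, m)

# Ridge readout fit and evaluated on the same data (pure memorization, no splits).
A = F @ F.T / m + gamma * np.eye(n1)
beta = np.linalg.solve(A, F @ Y.T / m)            # shape (n1, 1)

train_error = np.mean((Y - beta.T @ F) ** 2)
print("memorization training error:", train_error)
```

Since both X and Y are freshly sampled noise, the only meaningful quantity here is the training error, which is consistent with the assessment above that no train/validation/test splits are specified.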