Learning better with Dale’s Law: A Spectral Perspective

Authors: Pingsheng Li, Jonathan Cornford, Arna Ghosh, Blake Richards

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We tested the three different types of networks (standard RNN, ColEI, and DANN) on three classical tasks for RNNs: the adding problem [27], sequential MNIST classification [28], and language modelling using the Penn Treebank [29]. All experiments were run with PyTorch version 1.5.0 on an RTX 8000 GPU cluster."
Researcher Affiliation | Academia | Pingsheng Li* (McGill University; Mila Quebec AI Institute; pingsheng.li@mail.mcgill.ca); Jonathan Cornford* (McGill University; Mila Quebec AI Institute; cornforj@mila.quebec); Arna Ghosh (McGill University; Mila Quebec AI Institute; ghosharn@mila.quebec); Blake Richards (McGill University; Montreal Neurological Institute; Mila Quebec AI Institute & CIFAR; blake.richards@mila.quebec)
Pseudocode | No | The paper describes model definitions and experimental procedures but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | "The reader can find the code for all our experiments available here."
Open Datasets | Yes | "We tested the three different types of networks (standard RNN, ColEI, and DANN) on three classical tasks for RNNs: the adding problem [27], sequential MNIST classification [28], and language modelling using the Penn Treebank [29]."
Dataset Splits | No | The paper mentions standard datasets such as sequential MNIST and Penn Treebank but does not explicitly specify training, validation, or test splits (e.g., percentages or sample counts).
Hardware Specification | Yes | "All experiments were run with PyTorch version 1.5.0 on an RTX 8000 GPU cluster."
Software Dependencies | Yes | "All experiments were run with PyTorch version 1.5.0 on an RTX 8000 GPU cluster."
Experiment Setup | Yes | "ColEI recurrent weights are initialised such that the greatest norm of eigenvalues, i.e. the spectral radius ρ, is 1.5 [9]. In contrast, the initialisation of DANNs & standard RNNs results in ρ ≈ 1/√3 (PyTorch default RNN initialisation). Therefore, we assessed model performance across a range of initial values of ρ (Appendix 9). Again, we found that ColEI networks learned poorly, and there was no value of ρ for which ColEI networks matched the performance of the best standard RNNs and DANNs." Figure 2 A-D captions: (A) Adding problem (1 layer of 10 neurons). (B) Sequential MNIST (3 layers of 100 neurons). (C) Penn Treebank (3 layers of 500 neurons). (D) Naturalistic Object Recognition (1 layer of 1000 neurons).
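The spectral-radius figures quoted above can be checked numerically. The sketch below (NumPy, with an arbitrary matrix size chosen for illustration) draws a recurrent matrix from PyTorch's default RNN initialisation, U(-1/√n, 1/√n), and confirms that its spectral radius concentrates near 1/√3, as the circular law predicts for i.i.d. entries with variance 1/(3n). The final rescaling to ρ = 1.5 is only an illustration of targeting the ColEI spectral radius; it is not the paper's actual Dale's-law-constrained ColEI initialisation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # illustrative size; any reasonably large n behaves similarly

# PyTorch default RNN init: uniform on (-1/sqrt(n), 1/sqrt(n)).
bound = 1.0 / np.sqrt(n)
W = rng.uniform(-bound, bound, size=(n, n))

# Spectral radius = largest eigenvalue magnitude. By the circular law this is
# approximately sigma * sqrt(n) = sqrt(1/(3n)) * sqrt(n) = 1/sqrt(3) ~ 0.577.
rho = np.abs(np.linalg.eigvals(W)).max()
print(f"rho = {rho:.3f}, 1/sqrt(3) = {1/np.sqrt(3):.3f}")

# Illustrative rescaling to the ColEI target spectral radius of 1.5
# (not the paper's actual sign-constrained ColEI initialisation).
W_scaled = W * (1.5 / rho)
rho_scaled = np.abs(np.linalg.eigvals(W_scaled)).max()
print(f"rescaled rho = {rho_scaled:.3f}")
```

Running this shows how far apart the two initialisation regimes sit: the default init starts well inside the unit circle, while the ColEI-style init starts outside it, which is the spectral difference the paper's experiment setup varies via ρ.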