Dendritic cortical microcircuits approximate the backpropagation algorithm

Authors: João Sacramento, Rui Ponte Costa, Yoshua Bengio, Walter Senn

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the learning capabilities of the model in regression and classification tasks, and show analytically that it approximates the error backpropagation algorithm. Moreover, our framework is consistent with recent observations of learning between brain areas and the architecture of cortical microcircuits. Finally, we empirically evaluate the performance of the model on nonlinear regression and recognition tasks.
Researcher Affiliation | Academia | João Sacramento, Department of Physiology, University of Bern, Switzerland (sacramento@pyl.unibe.ch); Rui Ponte Costa, Department of Physiology, University of Bern, Switzerland (costa@pyl.unibe.ch); Yoshua Bengio, Mila and Université de Montréal, Canada (yoshua.bengio@mila.quebec); Walter Senn, Department of Physiology, University of Bern, Switzerland (senn@pyl.unibe.ch)
Pseudocode | No | The paper includes mathematical equations for the model but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide links to a code repository.
Open Datasets | Yes | Next, we turn to the problem of classifying MNIST handwritten digits. We train the models on the standard MNIST handwritten image database, further splitting the training set into 55000 training and 5000 validation examples.
Dataset Splits | Yes | We train the models on the standard MNIST handwritten image database, further splitting the training set into 55000 training and 5000 validation examples. (See the split sketch after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python version, library versions).
Experiment Setup | Yes | Learning rates were manually chosen to yield best performance. Error curves are exponential moving averages of the sum of squared errors loss ‖r^trgt − r_2^P‖² computed after every example on unseen input patterns. Test error performance is measured in a noise-free setting (σ = 0). Plasticity induction terms given by Eqs. 7-9 are low-pass filtered with time constant τ_w before being definitively consolidated, to dampen fluctuations; synaptic plasticity is kept on throughout. To speed up training we use a mini-batch strategy on every learning rule, whereby weight changes are averaged across 10 images before being applied. We take the neuronal transfer function φ to be a logistic function, φ(u) = 1/(1 + exp(−u)), and include a learnable threshold on each neuron, modelled as an additional input fixed at unity with a plastic weight. (See the setup sketch after the table.)
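
The Dataset Splits row quotes a 55000/5000 split of the standard 60000-image MNIST training set. As a rough illustration only (the paper does not release code; the tensorflow.keras loader, the shuffling seed, and the [0, 1] rescaling below are all assumptions), a minimal split sketch in Python could look like this:

    import numpy as np
    from tensorflow.keras.datasets import mnist  # assumption: any MNIST loader would do

    # Standard MNIST database: 60000 training and 10000 test examples.
    (x_train, y_train), (x_test, y_test) = mnist.load_data()

    # Shuffle deterministically, then carve the training set into
    # 55000 training and 5000 validation examples, as quoted above.
    rng = np.random.default_rng(0)                     # arbitrary seed
    perm = rng.permutation(len(x_train))
    x_train, y_train = x_train[perm], y_train[perm]

    x_val, y_val = x_train[55000:], y_train[55000:]
    x_train, y_train = x_train[:55000], y_train[:55000]

    # Flatten images and rescale pixel intensities to [0, 1]
    # (a common convention, not stated in the quoted excerpt).
    x_train = x_train.reshape(len(x_train), -1) / 255.0
    x_val = x_val.reshape(len(x_val), -1) / 255.0
    x_test = x_test.reshape(len(x_test), -1) / 255.0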
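
The Experiment Setup row bundles several concrete choices: a logistic transfer function with a learnable threshold (an extra input fixed at unity with a plastic weight), weight changes averaged over mini-batches of 10 images, plasticity induction terms low-pass filtered with time constant τ_w before consolidation, and error curves tracked as exponential moving averages of the squared error. The sketch below only illustrates how those pieces fit together; a plain delta rule stands in for the paper's dendritic plasticity rules (Eqs. 7-9), and every name and constant (phi, training_step, eta, tau_w, ema_alpha, the layer sizes) is an illustrative assumption, not the authors' implementation:

    import numpy as np

    def phi(u):
        # Logistic transfer function, phi(u) = 1 / (1 + exp(-u)).
        return 1.0 / (1.0 + np.exp(-u))

    def with_bias(x):
        # Learnable threshold: append an input fixed at unity whose weight is plastic.
        return np.concatenate([x, np.ones(1)])

    # Illustrative shapes and constants (not taken from the paper).
    n_in, n_out = 784, 10
    W = 0.01 * np.random.randn(n_out, n_in + 1)   # extra column = threshold weight
    eta = 0.1          # learning rate ("manually chosen" in the paper)
    tau_w = 30.0       # low-pass time constant for plasticity induction (placeholder)
    dt = 1.0           # integration step for the low-pass filter
    ema_alpha = 0.01   # smoothing factor for the error curve
    dW_filtered = np.zeros_like(W)
    error_ema = 0.0

    def training_step(batch_x, batch_y):
        # One update on a mini-batch of 10 images; a simple delta rule
        # stands in here for the dendritic plasticity rules of Eqs. 7-9.
        global W, dW_filtered, error_ema
        dW_batch = np.zeros_like(W)
        for x, r_target in zip(batch_x, batch_y):
            r = phi(W @ with_bias(x))                  # output rates
            err = r_target - r
            dW_batch += np.outer(err, with_bias(x))    # plasticity induction
            # Exponential moving average of the sum of squared errors.
            error_ema = (1 - ema_alpha) * error_ema + ema_alpha * np.sum(err ** 2)
        dW_batch /= len(batch_x)                       # average across the mini-batch
        # Low-pass filter the induction term before consolidating it into the weights.
        dW_filtered += (dt / tau_w) * (dW_batch - dW_filtered)
        W += eta * dW_filtered

A call like training_step(x_train[:10], targets[:10]) would consume one mini-batch; how the class targets are encoded (e.g., some hypothetical one-hot scheme) is likewise an assumption rather than a detail taken from the paper.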