Credit Assignment in Neural Networks through Deep Feedback Control
Authors: Alexander Meulemans, Matilde Tristany Farinha, Javier Garcia Ordonez, Pau Vilimelis Aceituno, João Sacramento, Benjamin F. Grewe
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we provide detailed experimental results, corroborating our theoretical contributions and showing that DFC does principled CA on standard computer-vision benchmarks in a way that fundamentally differs from standard BP. We evaluate DFC in detail on toy experiments to showcase that our theoretical results translate to practice (Section 6.1) and on a modest range of computer vision benchmarks (MNIST classification and autoencoding [40], and Fashion-MNIST classification [41]) to show that DFC can do precise CA in more challenging settings (Section 6.2). |
| Researcher Affiliation | Academia | Institute of Neuroinformatics, University of Zürich and ETH Zürich; ameulema@ethz.ch |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | PyTorch implementation of all methods is available at https://github.com/meulemansalex/deep_feedback_control. |
| Open Datasets | Yes | We evaluate DFC in detail on toy experiments to showcase that our theoretical results translate to practice (Section 6.1) and on a modest range of computer vision benchmarks (MNIST classification and autoencoding [40], and Fashion-MNIST classification [41]) to show that DFC can do precise CA in more challenging settings (Section 6.2). |
| Dataset Splits | Yes | Test errors (classification) and test loss (autoencoder) corresponding to the epoch with the best validation result (for 5000 validation samples) over a training of 100 epochs (classification) or 25 epochs (autoencoder). |
| Hardware Specification | No | The paper discusses the potential for DFC implementation on analog hardware in the future but does not provide any specific details (e.g., GPU/CPU models, memory amounts) about the hardware used to run the experiments reported in the paper. |
| Software Dependencies | No | The paper mentions 'PyTorch implementation' and 'Adam optimizer' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Architectures: 3x256 fully connected (FC) tanh hidden layers and a softmax output (classification); 256-32-256 FC hidden layers for the MNIST autoencoder with tanh-linear-tanh nonlinearities and a linear output. Results for the nonlinear student-teacher regression task use layer sizes 15-10-10-5, tanh nonlinearities, a linear output layer, k_p = 1.5, λ = 0.05, and α = 0.0015. |
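
The Experiment Setup row above describes the network architectures only in prose. Below is a minimal sketch, assuming plain PyTorch `nn.Sequential` modules, of those forward architectures; it does not reproduce the DFC controller or feedback-weight dynamics, which are implemented in the authors' linked repository.

```python
# Minimal sketch (not the authors' code) of the architectures listed in the
# Experiment Setup row. Only the forward passes are shown; the DFC training
# dynamics from the paper are not reproduced here.
import torch.nn as nn

# Classification networks (MNIST / Fashion-MNIST): three 256-unit tanh hidden
# layers and a 10-way output (softmax is folded into the loss in practice).
classifier = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 10),
)

# MNIST autoencoder: 256-32-256 hidden layers with tanh-linear-tanh
# nonlinearities and a linear output reconstructing the 784-dimensional input.
autoencoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.Tanh(),
    nn.Linear(256, 32),               # linear bottleneck
    nn.Linear(32, 256), nn.Tanh(),
    nn.Linear(256, 28 * 28),
)

# Student network for the nonlinear student-teacher regression task:
# layer sizes 15-10-10-5 with tanh hidden units and a linear output layer.
student = nn.Sequential(
    nn.Linear(15, 10), nn.Tanh(),
    nn.Linear(10, 10), nn.Tanh(),
    nn.Linear(10, 5),
)
```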
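
The Dataset Splits row describes holding out 5000 validation samples and reporting test metrics from the epoch with the best validation result. The following is a minimal sketch of that protocol using torchvision's MNIST loader; the batch sizes and learning rate are illustrative assumptions, and standard backpropagation with Adam stands in for the paper's DFC updates.

```python
# Minimal sketch (not the authors' code) of the split and model-selection
# protocol in the Dataset Splits row: 5000 held-out validation samples and
# model selection at the epoch with the best validation error.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

train_full = datasets.MNIST("./data", train=True, download=True,
                            transform=transforms.ToTensor())
# 5000 validation samples, as stated in the paper; the rest is used for training.
train_set, val_set = random_split(train_full, [len(train_full) - 5000, 5000])
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)  # batch size assumed
val_loader = DataLoader(val_set, batch_size=256)

model = nn.Sequential(  # the 3x256 tanh classifier from the sketch above
    nn.Flatten(),
    nn.Linear(784, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 10),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # learning rate assumed

best_err, best_state = float("inf"), None
for epoch in range(100):  # 100 epochs for classification (25 for the autoencoder)
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        F.cross_entropy(model(x), y).backward()
        optimizer.step()

    # Validation error for this epoch.
    model.eval()
    wrong, total = 0, 0
    with torch.no_grad():
        for x, y in val_loader:
            wrong += (model(x).argmax(dim=1) != y).sum().item()
            total += y.numel()
    val_err = wrong / total

    if val_err < best_err:  # keep the weights from the best validation epoch
        best_err, best_state = val_err, copy.deepcopy(model.state_dict())

model.load_state_dict(best_state)  # report test metrics with these weights
```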