Learning to Learn with Feedback and Local Plasticity
Authors: Jack Lindsey, Ashok Litwin-Kumar
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that meta-trained networks effectively use feedback connections to perform online credit assignment in multi-layer architectures. Surprisingly, this approach matches or exceeds a state-of-the-art gradient-based online meta-learning algorithm on regression and classification tasks, excelling in particular at continual learning. Analysis of the weight updates employed by these models reveals that they differ qualitatively from gradient descent in a way that reduces interference between updates. Our results support the view that biologically plausible learning mechanisms may not only match gradient descent-based learning, but also overcome its limitations. |
| Researcher Affiliation | Academia | Jack Lindsey, Ashok Litwin-Kumar Columbia University, Department of Neuroscience {j.lindsey, a.litwin-kumar}@columbia.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. Procedures are described in narrative text. |
| Open Source Code | Yes | Source code for our experiments is available at github.com/jlindsey15/FeedbackAndLocalPlasticity |
| Open Datasets | Yes | We consider the Omniglot [17] and Mini-Imagenet [33] datasets. |
| Dataset Splits | Yes | Meta-training is performed for 20,000 episodes. ... Meta-training is performed for 40,000 episodes. ... In each case, the dataset is split into meta-training and meta-testing classes. During an episode, k examples from each of N classes are presented. In the i.i.d. version of the task, they are presented in random order, while in the continual learning version, all k examples from one class are presented before proceeding to the next. (This episode ordering is sketched in code after the table.) |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using "Adam optimizer [15]" but does not specify version numbers for any software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | We used the Adam optimizer [15] for meta-optimization. Additional implementation details can be found in Appendix B. ... We use a nine-layer fully connected network for regression tasks, and a network with six convolutional layers + two fully connected layers for classification tasks. ... Meta-training is performed for 20,000 episodes. ... Meta-training is performed for 40,000 episodes. (The named architectures are sketched in the second code block after the table.) |
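The "Dataset Splits" row above describes the episode structure: k examples from each of N classes, presented in fully random order for the i.i.d. task or class-by-class for the continual learning task. The sketch below illustrates only this ordering logic; the function name `make_episode` and the (class, example) index representation are our own conventions and are not taken from the authors' released code.

```python
# Illustrative sketch of the episode ordering described in the "Dataset Splits" row.
# continual=False gives the i.i.d. presentation (all N*k examples shuffled);
# continual=True presents all k examples of one class before moving to the next.
import random


def make_episode(num_classes, k, continual, seed=None):
    """Return a presentation order as a list of (class_id, example_id) pairs."""
    rng = random.Random(seed)
    if continual:
        # Continual version: shuffle the class order and the examples within each
        # class, but keep each class's k examples contiguous.
        classes = list(range(num_classes))
        rng.shuffle(classes)
        return [(c, i) for c in classes for i in rng.sample(range(k), k)]
    # i.i.d. version: all N*k (class, example) pairs in fully random order.
    episode = [(c, i) for c in range(num_classes) for i in range(k)]
    rng.shuffle(episode)
    return episode


if __name__ == "__main__":
    print(make_episode(num_classes=3, k=2, continual=False, seed=0))
    print(make_episode(num_classes=3, k=2, continual=True, seed=0))
```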
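The "Experiment Setup" row specifies the architectures only at the level of layer counts: nine fully connected layers for regression and six convolutional plus two fully connected layers for classification. The hedged PyTorch sketch below fills in widths, channel counts, kernel sizes, and activations with placeholder assumptions; these are not the authors' hyperparameters, which are given in Appendix B of the paper and the linked repository.

```python
# Hedged PyTorch sketch of the layer counts quoted in the "Experiment Setup" row.
# All widths, channel counts, kernel sizes, and activations are placeholder assumptions.
import torch
import torch.nn as nn


def make_regression_net(in_dim=1, hidden=40, out_dim=1, n_layers=9):
    # "Nine-layer fully connected network": interpreted here as nine Linear layers
    # (eight hidden + one output), each hidden layer followed by a ReLU.
    layers, d = [], in_dim
    for _ in range(n_layers - 1):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)


def make_classification_net(in_channels=1, channels=64, n_classes=5):
    # "Six convolutional layers + two fully connected layers."
    conv, c = [], in_channels
    for _ in range(6):
        conv += [nn.Conv2d(c, channels, kernel_size=3, padding=1), nn.ReLU()]
        c = channels
    return nn.Sequential(
        *conv,
        nn.AdaptiveAvgPool2d(1),  # collapse spatial dims so input size is flexible
        nn.Flatten(),
        nn.Linear(channels, channels),   # first fully connected layer
        nn.ReLU(),
        nn.Linear(channels, n_classes),  # second (output) fully connected layer
    )


if __name__ == "__main__":
    x = torch.randn(4, 1, 28, 28)  # e.g., a batch of resized Omniglot images
    print(make_classification_net()(x).shape)               # torch.Size([4, 5])
    print(make_regression_net()(torch.randn(4, 1)).shape)   # torch.Size([4, 1])
```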