Differentiable plasticity: training plastic neural networks with backpropagation
Authors: Thomas Miconi, Kenneth Stanley, Jeff Clune
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that plasticity, just like connection weights, can be optimized by gradient descent in large (millions of parameters) recurrent networks with Hebbian plastic connections. First, recurrent plastic networks with more than two million parameters can be trained to memorize and reconstruct sets of novel, high-dimensional (1,000+ pixels) natural images not seen during training. Crucially, traditional non-plastic recurrent networks fail to solve this task. Furthermore, trained plastic networks can also solve generic meta-learning tasks such as the Omniglot task, with competitive results and little parameter overhead. Finally, in reinforcement learning settings, plastic networks outperform a non-plastic equivalent in a maze exploration task. (See the plasticity-rule sketch after the table.) |
| Researcher Affiliation | Industry | Uber AI Labs. Correspondence to: Thomas Miconi <tmiconi@uber.com>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks in the main text. |
| Open Source Code | Yes | The code for all experiments described in this paper is available at https://github.com/uber-common/differentiable-plasticity |
| Open Datasets | Yes | Images are from the CIFAR-10 database, which contains 60,000 images of size 32 by 32 pixels (i.e. 1,024 pixels in total), converted to grayscale pixels between 0 and 1.0. (See the data-loading sketch after the table.) |
| Dataset Splits | No | The paper mentions training and test sets (e.g., '1,523 classes for training and 100 classes... for testing' for Omniglot) but does not provide specific train/validation/test splits with percentages or counts for all experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models or processor types used for running experiments. |
| Software Dependencies | No | All experiments reported here use the PyTorch package to compute gradients. However, no specific version number for PyTorch or other software dependencies is provided. |
| Experiment Setup | Yes | The gradient of this error over the w_{i,j} and α_{i,j} coefficients is then computed by backpropagation, and these coefficients are optimized through an Adam solver (Kingma & Ba, 2015) with learning rate 0.001. (See the training-step sketch after the table.) |
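
For concreteness, the plasticity rule the quoted abstract refers to can be written in a few lines of PyTorch. The sketch below follows the update equations described in the paper: each connection has a fixed weight w_{i,j} plus a plastic term α_{i,j}·Hebb_{i,j}, and the Hebbian trace is updated as a running average of pre- and post-synaptic activity products. The class name, layer size, tanh nonlinearity, and η value are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class PlasticRecurrentLayer(nn.Module):
    """Sketch of a fully recurrent layer with differentiable Hebbian plasticity."""

    def __init__(self, n_units, eta=0.01):
        super().__init__()
        self.eta = eta                                                    # Hebbian trace learning rate
        self.w = nn.Parameter(0.01 * torch.randn(n_units, n_units))       # fixed component w_{i,j}
        self.alpha = nn.Parameter(0.01 * torch.randn(n_units, n_units))   # plasticity coefficients alpha_{i,j}

    def forward(self, x_prev, hebb):
        # x_prev: (batch, n) activations at time t-1; hebb: (batch, n, n) Hebbian traces.
        w_eff = self.w + self.alpha * hebb                                # per-sample effective weights
        x = torch.tanh(torch.bmm(x_prev.unsqueeze(1), w_eff).squeeze(1))
        # "Simple decay" Hebbian update: running average of outer products of pre/post activity.
        hebb = (1.0 - self.eta) * hebb + self.eta * torch.bmm(x_prev.unsqueeze(2), x.unsqueeze(1))
        return x, hebb
```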
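The training setup quoted in the Experiment Setup row (backpropagation through the w_{i,j} and α_{i,j} coefficients, Adam with learning rate 0.001) then reduces to an ordinary PyTorch training step, since both sets of coefficients are regular parameters. The episode length, inputs, and loss below are placeholders rather than the paper's actual image-reconstruction protocol.

```python
# Hypothetical training step reusing the PlasticRecurrentLayer sketch above.
model = PlasticRecurrentLayer(n_units=256)                  # the paper's image task uses 1,024-pixel images
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)  # learning rate stated in the paper

batch, n = 8, 256
x = torch.rand(batch, n)             # placeholder input pattern clamped onto the units
hebb = torch.zeros(batch, n, n)      # Hebbian traces start at zero for each episode
target = torch.rand(batch, n)        # placeholder pattern to reconstruct

optimizer.zero_grad()
for _ in range(10):                  # placeholder episode length
    x, hebb = model(x, hebb)
loss = ((x - target) ** 2).mean()    # placeholder reconstruction error
loss.backward()                      # gradients flow into both w and alpha
optimizer.step()
```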
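Finally, the CIFAR-10 preprocessing described in the Open Datasets row (32×32 images converted to grayscale with pixel values in [0, 1], i.e. 1,024 pixels per image) can be approximated with torchvision. The paper does not say how the authors loaded the data, so treat this purely as one way to obtain equivalent inputs; torchvision's Grayscale uses an ITU-R 601-2 luma transform, and the paper's exact grayscale conversion is unspecified, so pixel values may differ slightly.

```python
import torchvision
import torchvision.transforms as T

# Grayscale 32x32 CIFAR-10 images with pixel values in [0, 1], flattened to 1,024-vectors.
transform = T.Compose([
    T.Grayscale(num_output_channels=1),    # RGB -> single luminance channel
    T.ToTensor(),                          # PIL image -> float tensor in [0, 1]
    T.Lambda(lambda img: img.view(-1)),    # (1, 32, 32) -> (1024,)
])
cifar_train = torchvision.datasets.CIFAR10(root="./data", train=True,
                                           download=True, transform=transform)
image, _ = cifar_train[0]                  # image.shape == torch.Size([1024])
```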