Biologically Motivated Algorithms for Propagating Local Target Representations
Authors: Alexander G. Ororbia, Ankur Mali
AAAI 2019, pp. 4651-4658
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our procedures to several other biologically-motivated algorithms, including two feedback alignment algorithms and Equilibrium Propagation. In two benchmarks, we find that both of our proposed algorithms yield stable performance and strong generalization compared to other competing back-propagation alternatives when training deeper, highly nonlinear networks, with LRA-E performing the best overall. ... In experiments on two classification benchmarks, we will show that these two algorithms generalize better than a variety of other biologically motivated learning approaches, all without employing the global feedback pathway required by back-propagation. |
| Researcher Affiliation | Academia | Alexander G. Ororbia, Rochester Institute of Technology, 102 Lomb Memorial Drive, Rochester, NY, USA 14623, ago@cs.rit.edu; Ankur Mali, Penn State University, Old Main, State College, PA 16801, aam35@ist.psu.edu |
| Pseudocode | Yes | Algorithm 1 LRA-E: Target and update computations. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository. |
| Open Datasets | Yes | MNIST: This dataset contains 28x28 images with grayscale pixel feature values in the range of [0, 255]. ... Available at the URL: http://yann.lecun.com/exdb/mnist/. Fashion MNIST: This database (Xiao, Rasul, and Vollgraf 2017) contains 28x28 grey-scale images of clothing items, meant to serve as a much more difficult drop-in replacement for MNIST itself. |
| Dataset Splits | Yes | We create a validation set of 2000 samples from the training split. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., specific GPU/CPU models, memory, or number of machines) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components such as RMSprop and Adam optimizers, but does not specify their version numbers or the versions of underlying libraries/frameworks (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | For both datasets and all models, over 100 epochs, we calculate updates over mini-batches of 50 samples. All feedforward architectures for all experiments were of either 3, 5, or 8 hidden layers of 256 processing elements. The post-activation function used was the hyperbolic tangent and the top layer was chosen to be a maximum-entropy classifier (i.e., a softmax function). The output layer objective for all algorithms was to minimize the categorical negative log likelihood. For LRA-E, however, we initialized the parameters using a zero-mean Gaussian distribution (variance of 0.05). We used the RMSprop (Tieleman and Hinton 2012) adaptive learning rate with a global step size of λ = 0.001. For Backprop, RFA, DFA, and LRA-E, we were able to use SGD (λ = 0.01). In this paper, β = 0.1, found with only minor preliminary tuning. At the top layer, we can set σL = α (a small, fixed value such as α = 0.01 worked well in our experiments). (A hedged code sketch of this configuration follows the table.) |
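
The quoted setup above can be summarized in a short training script. The sketch below is a minimal, hedged reconstruction: it assumes PyTorch and torchvision (the paper names no framework), uses the smallest reported architecture (3 hidden layers of 256 tanh units), the 2000-sample validation split, mini-batches of 50 over 100 epochs, the zero-mean Gaussian initialization with variance 0.05, and SGD with λ = 0.01 as reported for Backprop and LRA-E. Ordinary back-propagation stands in for the paper's LRA-E local update rule, which is not reproduced here.

```python
# Hedged sketch of the shared training configuration; PyTorch/torchvision are
# assumptions (the paper does not name a framework), and ordinary back-propagation
# stands in for the paper's LRA-E update rule, which is not implemented here.
import math
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

torch.manual_seed(0)

# MNIST, with a 2000-sample validation set carved out of the training split
train_full = datasets.MNIST(root="./data", train=True, download=True,
                            transform=transforms.ToTensor())
train_set, val_set = random_split(train_full, [len(train_full) - 2000, 2000])
train_loader = DataLoader(train_set, batch_size=50, shuffle=True)  # mini-batches of 50
val_loader = DataLoader(val_set, batch_size=200)

# Smallest reported architecture: 3 hidden layers of 256 tanh units, with a
# softmax (maximum-entropy) output layer; CrossEntropyLoss below combines the
# softmax with the categorical negative log-likelihood objective.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 256), nn.Tanh(),
    nn.Linear(256, 10),
)

# Zero-mean Gaussian initialization with variance 0.05 (std = sqrt(0.05)),
# as reported for LRA-E.
for m in model.modules():
    if isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, mean=0.0, std=math.sqrt(0.05))
        nn.init.zeros_(m.bias)

criterion = nn.CrossEntropyLoss()
# SGD with step size 0.01, as reported for Backprop/RFA/DFA/LRA-E;
# RMSprop(lr=0.001) is the reported alternative for the other methods.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):  # 100 epochs, per the reported setup
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

    # validation accuracy on the 2000 held-out samples
    model.eval()
    correct = 0
    with torch.no_grad():
        for x, y in val_loader:
            correct += (model(x).argmax(dim=1) == y).sum().item()
    print(f"epoch {epoch + 1}: val acc = {correct / len(val_set):.4f}")
```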