Direct Feedback Alignment Provides Learning in Deep Neural Networks
Authors: Arild Nøkland
Venue: NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that the test performance on MNIST and CIFAR is almost as good as that obtained with back-propagation for fully connected networks. |
| Researcher Affiliation | Unable to Determine | Arild Nøkland, Trondheim, Norway, arild.nokland@gmail.com. The paper provides the author's name, city, country, and a personal email address, but no explicit institutional affiliation (university or company name), which makes it impossible to definitively classify the work as Academia, Industry, or Collaboration based on the given criteria. |
| Pseudocode | No | The paper provides mathematical equations for the calculations and updates, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block with structured steps (an illustrative sketch of one such update step is given below the table). |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the methodology, nor does it provide any links to a code repository. |
| Open Datasets | Yes | To investigate whether DFA learns useful features in the hidden layers, a 3x400 tanh network was trained on MNIST with both BP and DFA. Several feed-forward networks were trained on MNIST and CIFAR to compare the performance of DFA with FA and BP. The method was able to fit the training set in all experiments performed on MNIST, CIFAR-10, and CIFAR-100. |
| Dataset Splits | No | For the MNIST dropout experiments, the learning rate, its decay, and the training time were chosen based on a validation set. While a validation set is mentioned, specific details such as exact split percentages or sample counts are not provided. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions that experiments were 'optimized with RMSprop [14]', but it does not provide specific version numbers for any software components, libraries, or programming languages used. |
| Experiment Setup | Yes | Training was stopped when the training error reached 0.01% or the number of epochs reached 300. A mini-batch size of 64 was used. No momentum or weight decay was used. The input data was scaled to be between 0 and 1, but for the convolutional networks the data was whitened. For FA and DFA, the weights and biases were initialized to zero, except for the ReLU networks. For BP and/or ReLU, the initial weights and biases were sampled from a uniform distribution in the range [-1/√fanin, 1/√fanin]. The random feedback weights were sampled from a uniform distribution in the range [-1/√fanout, 1/√fanout]. (An illustrative initialization sketch based on this description follows the table.) |
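
The paper specifies the DFA updates only as equations, so the following is a minimal, hedged sketch of what one training step could look like for a two-hidden-layer tanh network with a softmax/cross-entropy output. The network sizes, variable names, the plain SGD update (the paper optimized with RMSprop), the feedback-matrix scaling, and the omission of biases are simplifications made here, not details taken from the paper or from released code (none exists).

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 784, 400, 10   # MNIST-sized, as in the 3x400 example

# Forward weights; the paper zero-initializes these for tanh FA/DFA networks.
W1 = np.zeros((n_in, n_hidden))
W2 = np.zeros((n_hidden, n_hidden))
W3 = np.zeros((n_hidden, n_out))

# Fixed random feedback matrices that project the output error directly to
# each hidden layer; they are never updated during training. The 1/sqrt
# scaling here is a simplification of the paper's fan-out rule.
B1 = rng.uniform(-1.0, 1.0, size=(n_out, n_hidden)) / np.sqrt(n_out)
B2 = rng.uniform(-1.0, 1.0, size=(n_out, n_hidden)) / np.sqrt(n_out)

def dfa_step(x, y_onehot, lr=1e-3):
    """One DFA update on a mini-batch: x is (batch, n_in), y_onehot is (batch, n_out)."""
    global W1, W2, W3

    # Forward pass (biases omitted for brevity).
    h1 = np.tanh(x @ W1)
    h2 = np.tanh(h1 @ W2)
    logits = h2 @ W3
    y_hat = np.exp(logits - logits.max(axis=1, keepdims=True))
    y_hat /= y_hat.sum(axis=1, keepdims=True)       # softmax output

    # Output error for softmax + cross-entropy, averaged over the batch.
    e = (y_hat - y_onehot) / x.shape[0]

    # DFA: hidden-layer error signals come straight from e through the fixed
    # random matrices B1, B2, not backward through W3.T and W2.T as in BP.
    d2 = (e @ B2) * (1.0 - h2 ** 2)                 # tanh derivative
    d1 = (e @ B1) * (1.0 - h1 ** 2)

    # Plain SGD shown for clarity; the paper used RMSprop.
    W3 -= lr * (h2.T @ e)
    W2 -= lr * (h1.T @ d2)
    W1 -= lr * (x.T @ d1)
```

The defining step is that `d1` and `d2` are computed directly from the output error `e` through the fixed random matrices, which is what distinguishes DFA from both back-propagation and layer-wise feedback alignment.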
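
The initialization ranges quoted in the Experiment Setup row can likewise be summarized in a short helper. The function names, the `activation` argument, and the reading of fan-in/fan-out are illustrative assumptions; the paper describes the ranges only in prose.

```python
import numpy as np

def init_forward_weights(fan_in, fan_out, method, activation, rng):
    """Forward weights: zeros for FA/DFA tanh networks; otherwise a uniform
    distribution in [-1/sqrt(fan_in), 1/sqrt(fan_in)], as for BP and/or ReLU."""
    if method in ("FA", "DFA") and activation != "relu":
        return np.zeros((fan_in, fan_out))
    limit = 1.0 / np.sqrt(fan_in)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def init_feedback_weights(n_error, fan_out, rng):
    """Fixed random feedback weights: uniform in
    [-1/sqrt(fan_out), 1/sqrt(fan_out)]; reading fan_out as the size of the
    layer the error is projected to is this sketch's interpretation."""
    limit = 1.0 / np.sqrt(fan_out)
    return rng.uniform(-limit, limit, size=(n_error, fan_out))

rng = np.random.default_rng(0)
W1 = init_forward_weights(784, 400, "DFA", "tanh", rng)   # zero-initialized
B1 = init_feedback_weights(10, 400, rng)                  # fixed random feedback
```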