Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Target Propagation in Recurrent Neural Networks
Authors: Nikolay Manchev, Michael Spratling
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed algorithm is initially tested and compared to BPTT on four synthetic time lag tasks, and its performance is also measured using the sequential MNIST data set. In addition, as TPTT uses target propagation, it allows for discrete nonlinearities and could potentially mitigate the credit assignment problem in more complex recurrent architectures. |
| Researcher Affiliation | Academia | Nikolay Manchev (EMAIL), Department of Informatics, King's College London, London, WC2B 4BG, UK; Michael Spratling (EMAIL), Department of Informatics, King's College London, London, WC2B 4BG, UK |
| Pseudocode | No | The paper describes the proposed algorithm (TPTT) using mathematical equations and textual descriptions, for instance, in Section 2 "Target Propagation Through Time", and illustrates concepts with diagrams (e.g., Figure 2). However, it does not present a distinct block of pseudocode or a clearly labeled algorithm. |
| Open Source Code | Yes | Source code for all experiments is available at https://github.com/nmanchev/tptt |
| Open Datasets | Yes | TPTT was tested on a subset of the pathological synthetic problems initially presented in Hochreiter and Schmidhuber (1997). The network was also tested on a sequence classification task based on the MNIST data set (Le Cun et al. 1998). |
| Dataset Splits | Yes | The MNIST data set was assembled by Le Cun et al. (1998) and is derived from the NIST Special Database 19 (Grother 1995). It contains 60,000 training and 10,000 test images with dimensions of 28×28 pixels each (784 pixels in total). ... The accuracy of the network was measured every 100 iterations on a validation set of 10,000 samples. ... The grid search was carried out using only 10,000 training and 1,000 test images, which were randomly selected from the complete data set. |
| Hardware Specification | No | The paper does not contain any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. It only discusses the computational aspects of the models. |
| Software Dependencies | No | The paper does not explicitly list any specific software dependencies with version numbers (e.g., Python, TensorFlow, PyTorch versions) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | The optimisation technique used for training the network was Nesterov's Accelerated Gradient (Nesterov 1983; Bengio et al. 2013) with the momentum µt set to 0.9. ... The network uses a softmax layer as its final layer (see Equation 2), and optimises a cross-entropy cost function... For the adding problem, which requires a real-valued output, the last layer of the network is linear, and the cost function the network minimises is the MSE... The number of images per mini-batch was set to 16, the training was capped to 1,000,000 iterations (≈267 epochs), and the number of neurons in the hidden layer was set to 100. A grid search was performed to find optimal αi, αg, and αf... The best accuracy was determined to be produced when using αi = 10⁻⁷, αf = 10⁻², and αg = 10⁻⁸. |
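For readers unfamiliar with the optimiser named in the Experiment Setup row, the following is a minimal, self-contained sketch of a Nesterov Accelerated Gradient update with momentum 0.9 (the value reported above). It is an illustration of the update rule only, not the authors' implementation; the function name `nesterov_step` and the toy quadratic objective are our own choices.

```python
def nesterov_step(w, v, grad_fn, lr, mu=0.9):
    """One Nesterov accelerated gradient update.

    w: parameter vector (list of floats), v: velocity, grad_fn: gradient
    function, lr: learning rate, mu: momentum (0.9 in the paper's setup).
    """
    # Key difference from plain momentum: the gradient is evaluated at
    # the look-ahead point w + mu * v, not at w itself.
    g = grad_fn([wi + mu * vi for wi, vi in zip(w, v)])
    v = [mu * vi - lr * gi for vi, gi in zip(v, g)]
    w = [wi + vi for wi, vi in zip(w, v)]
    return w, v

# Toy check: minimise f(w) = 0.5 * sum(w_i^2), whose gradient is w itself.
w, v = [5.0, -3.0], [0.0, 0.0]
for _ in range(200):
    w, v = nesterov_step(w, v, lambda x: x, lr=0.1)
```

In the paper's setting the gradient function would instead come from backpropagating the layer-local target losses of TPTT, with the grid-searched step sizes αi, αf, and αg in place of the single `lr` used here.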