Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron

Authors: Christian Schmid, James M. Murray

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Additionally, we verify our approach with real data using the MNIST dataset. We characterize the effects of the learning rule (supervised or reinforcement learning, SL/RL) and input-data distribution on the perceptron's learning curve and the forgetting curve as subsequent tasks are learned.
Researcher Affiliation | Academia | Christian Schmid, Institute of Neuroscience, University of Oregon, cschmid9@uoregon.edu; James M. Murray, Institute of Neuroscience, University of Oregon, jmurray9@uoregon.edu
Pseudocode | No | The paper provides mathematical derivations and equations but does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | All the source code files can be found in the supplemental materials, which should allow a reader to straightforwardly recreate all experimental results.
Open Datasets | Yes | Additionally, we verify our approach with real data using the MNIST dataset.
Dataset Splits | No | The paper refers to training and testing on datasets, including a 'hold-out set' for testing, but does not explicitly specify train/validation/test splits with percentages or counts.
Hardware Specification | Yes | The computations were performed on an NVIDIA Titan Xp GPU, with runtimes of at most a few minutes.
Software Dependencies | Yes | The numerical code implementing the model and performing the analyses was mostly written in JAX [Bradbury et al., 2018], as well as Wolfram Mathematica and SciPy [Virtanen et al., 2020]. (The citation for SciPy specifically mentions 'SciPy 1.0'.)
Experiment Setup | Yes | For Fig. 2, the flow fields were plotted for the limit of zero input noise and a regularization parameter of λ = 0.1. The learning curves are plotted for λ = 0 and Σ = σ²I, with σ = 0.1 and σ = 1... We set λ = 1. For the forgetting curves in Fig. 6, we set λ = 10 and the learning rate to η = 10^-2... For all other simulations, the learning rate was set to η = 10^-3.
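To make the Experiment Setup row concrete, the sketch below simulates the kind of experiment those hyperparameters describe: a non-linear perceptron trained by supervised gradient descent on squared error with L2 regularization λ and Gaussian input noise of scale σ. This is a hypothetical NumPy reconstruction for illustration (the authors' actual code is in JAX and is in their supplemental materials); the function name, data generation, and epoch count are assumptions, and the η and λ values merely echo the ranges quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_perceptron(X, targets, eta=1e-2, lam=0.1, epochs=100):
    """SGD on L = (t - y)^2 / 2 + lam * |w|^2 / 2, with y = tanh(w @ x).

    A sketch of SL dynamics in a non-linear perceptron; not the authors' code.
    """
    n = X.shape[1]
    w = rng.normal(scale=1.0 / np.sqrt(n), size=n)  # small random init
    for _ in range(epochs):
        for x, t in zip(X, targets):
            y = np.tanh(w @ x)
            # Gradient of the regularized squared error w.r.t. w
            grad = -(t - y) * (1.0 - y**2) * x + lam * w
            w -= eta * grad
    return w

# Toy binary task: inputs clustered around +/- mu with input noise sigma = 0.1
n, p, sigma = 20, 200, 0.1
mu = rng.normal(size=n)
targets = rng.choice([-1.0, 1.0], size=p)
X = targets[:, None] * mu[None, :] + sigma * rng.normal(size=(p, n))

w = train_perceptron(X, targets)
acc = np.mean(np.sign(np.tanh(X @ w)) == targets)  # training accuracy
```

The per-sample update and the λ|w|²/2 penalty mirror the regularized gradient-descent setting quoted from the paper; the clustered-Gaussian inputs stand in for the Σ = σ²I input distribution mentioned for the learning curves.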