Action-modulated midbrain dopamine activity arises from distributed control policies

Authors: Jack Lindsey, Ashok Litwin-Kumar

NeurIPS 2022

Reproducibility Variable: Result — LLM Response
Research Type: Experimental — "On benchmark navigation and reaching tasks, we show empirically that this model is capable of learning from data driven completely or in part by other policies (e.g. from other brain regions)."
Researcher Affiliation: Academia — Jack Lindsey, Department of Neuroscience, Columbia University, New York, NY (jackwlindsey@gmail.com); Ashok Litwin-Kumar, Department of Neuroscience, Columbia University, New York, NY (a.litwin-kumar@columbia.edu)
Pseudocode: No — The paper includes mathematical equations and a schematic diagram (Figure 1), but no explicit pseudocode or algorithm blocks are presented.
Open Source Code: Yes — "Code for our experiments is provided at https://github.com/jlindsey15/ActionDopamineDistributedControl"
Open Datasets: No — The paper uses simulated tasks ('Open-field navigation', 'Two-joint arm') and describes their setup, but it does not specify any publicly available datasets or provide links or citations for accessing data used in its simulations.
Dataset Splits: No — The paper mentions optimizing hyperparameters and refers to Appendix A.5 for details, but it does not explicitly state training/validation/test split percentages or sample counts in the provided text.
Hardware Specification: No — The paper states that compute resources and hardware type are included in Appendix A.5, but these details are not present in the main body of the provided text.
Software Dependencies: No — The paper describes the model architecture and algorithms used, but it does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks).
Experiment Setup: Yes — "For both tasks we used a neural network architecture with a single hidden layer... The hidden layer had size 256 and used ReLU nonlinearities. Input weights to the hidden layer were fixed at random (Kaiming uniform) initializations... For all models we optimized the learning rate and magnitude of exploration noise as hyperparameters. For the action surprise model we optimized the coefficient 1/σ² of the action surprise term as a hyperparameter."
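The quoted setup can be sketched as a minimal forward pass, assuming illustrative input/output dimensions and a zero-initialized readout (the paper's excerpt specifies only the 256-unit ReLU hidden layer and the fixed Kaiming-uniform input weights; everything else here is an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def kaiming_uniform(fan_in, fan_out, rng):
    # Kaiming (He) uniform initialization: samples from U(-bound, bound)
    # with bound = sqrt(6 / fan_in), suited to ReLU networks.
    bound = np.sqrt(6.0 / fan_in)
    return rng.uniform(-bound, bound, size=(fan_in, fan_out))

n_in, n_hidden, n_out = 4, 256, 2  # n_in and n_out are illustrative

W_in = kaiming_uniform(n_in, n_hidden, rng)  # fixed at initialization, not trained
W_out = np.zeros((n_hidden, n_out))          # trainable readout (assumed zero init)

def forward(x):
    h = np.maximum(0.0, x @ W_in)  # single ReLU hidden layer of size 256
    return h @ W_out

y = forward(rng.standard_normal(n_in))
```

In this scheme only `W_out` would be updated during learning, consistent with the fixed random input weights described in the setup.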