Action-modulated midbrain dopamine activity arises from distributed control policies
Authors: Jack Lindsey, Ashok Litwin-Kumar
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On benchmark navigation and reaching tasks, we show empirically that this model is capable of learning from data driven completely or in part by other policies (e.g. from other brain regions). |
| Researcher Affiliation | Academia | Jack Lindsey, Department of Neuroscience, Columbia University, New York, NY, jackwlindsey@gmail.com; Ashok Litwin-Kumar, Department of Neuroscience, Columbia University, New York, NY, a.litwin-kumar@columbia.edu |
| Pseudocode | No | The paper includes mathematical equations and a schematic diagram (Figure 1), but no explicit pseudocode or algorithm blocks are presented. |
| Open Source Code | Yes | Code for our experiments is provided at https://github.com/jlindsey15/ActionDopamineDistributedControl |
| Open Datasets | No | The paper uses simulated tasks ('Open-field navigation', 'Two-joint arm') and describes the setup, but it does not specify any publicly available datasets or provide links/citations for accessing data used in their simulations. |
| Dataset Splits | No | The paper mentions optimizing hyperparameters and refers to Appendix A.5 for details, but it does not explicitly state specific training/validation/test dataset split percentages or sample counts in the provided text. |
| Hardware Specification | No | The paper states that compute resources and type are included in Appendix A.5, but these specific details are not present in the main body of the provided text. |
| Software Dependencies | No | The paper describes the model architecture and algorithms used, but it does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | For both tasks we used a neural network architecture with a single hidden layer... The hidden layer had size 256 and used ReLU nonlinearities. Input weights to the hidden layer were fixed at random (Kaiming uniform) initializations... For all models we optimized the learning rate and magnitude of exploration noise as hyperparameters. For the action surprise model we optimized the coefficient 1/σ² of the action surprise term as a hyperparameter. |
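
The Experiment Setup row describes a single-hidden-layer network with 256 ReLU units whose input weights are frozen at a Kaiming-uniform initialization, with the learning rate and the magnitude of exploration noise tuned as hyperparameters. The sketch below illustrates that architecture only; it is a minimal illustration assuming a PyTorch implementation, and the class name `PolicyNet`, the placeholder dimensions, and the Gaussian form of the exploration noise are assumptions rather than details taken from the paper or its repository.

```python
# Minimal sketch of the described architecture: one hidden layer of 256 ReLU
# units with frozen Kaiming-uniform input weights and a trainable readout.
# Names, dimensions, and the Gaussian exploration noise are assumptions.
import torch
import torch.nn as nn


class PolicyNet(nn.Module):
    def __init__(self, obs_dim: int, action_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.hidden = nn.Linear(obs_dim, hidden_dim)
        nn.init.kaiming_uniform_(self.hidden.weight)
        # Input weights are fixed at their random initialization (not trained).
        self.hidden.weight.requires_grad_(False)
        self.hidden.bias.requires_grad_(False)
        self.readout = nn.Linear(hidden_dim, action_dim)

    def forward(self, obs: torch.Tensor, noise_std: float = 0.0) -> torch.Tensor:
        h = torch.relu(self.hidden(obs))
        action = self.readout(h)
        if noise_std > 0.0:
            # Exploration noise; its magnitude is one of the tuned hyperparameters.
            action = action + noise_std * torch.randn_like(action)
        return action


# Example usage with placeholder observation/action dimensions.
policy = PolicyNet(obs_dim=4, action_dim=2)
action = policy(torch.randn(1, 4), noise_std=0.1)
```

Only the readout weights receive gradients in this sketch, reflecting the stated choice to keep the hidden layer's input weights at their random initialization; the action surprise coefficient 1/σ² mentioned in the table would enter the training objective, which is not reproduced here.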