Better Transfer Learning with Inferred Successor Maps
Authors: Tamas Madarasz, Tim Behrens
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first tested the performance of the model in a tabular maze navigation task (Fig. 1e) where both the start and goal locations changed every 20 trials, giving a large number of partially overlapping tasks. The total number of steps taken to complete all episodes, using the best performing setting for each algorithm, is shown in Fig. 2a and Table S1. Figs. 2b and S2 show lengths of individual episodes. We performed two sets of analyses to test if the brain uses similar mechanisms to BSR, comparing our model to experimental data from rodent navigation tasks with changing reward settings. [a minimal sketch of this task protocol follows the table] |
| Researcher Affiliation | Academia | Tamas J. Madarasz, University of Oxford, tamas.madarasz@ndcn.ox.ac.uk; Timothy E. Behrens, University of Oxford, behrens@fmrib.ox.ac.uk |
| Pseudocode | Yes | Our algorithm, the Bayesian Successor Representation (BSR, Fig. 1a, Algorithm 1 in Supplementary Information) extends the successor temporal difference (TD) learning framework in the first instance by using multiple successor maps. [a single-map sketch of the underlying SR TD update follows the table] |
| Open Source Code | No | The paper does not provide a direct link to open-source code for the methodology described, nor does it explicitly state that the code is publicly released. |
| Open Datasets | No | The paper refers to "experimental data from rodent navigation tasks" from sources [26] and [21], and its acknowledgements thank "Roddy Grieves and Paul Dudchenko for generously sharing data from their experiments." However, it does not provide concrete access information (link, DOI, repository) for a publicly available dataset, beyond citing the original sources where the data might originate. |
| Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts) for its experiments. It describes experimental settings and parameters but not data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper describes algorithms and methods (e.g., "successor temporal difference (TD) learning framework", "Dirichlet process mixture model"), but it does not provide specific names and version numbers of ancillary software dependencies or libraries used for implementation. |
| Experiment Setup | Yes | We tested each algorithm with different ϵ-greedy exploration rates ϵ ∈ {0, 0.05, …, 0.35} (after an initial period of high exploration) and SR learning rates α_SR ∈ {0.001, 0.005, 0.01, 0.05, 0.1}. [a sketch of this hyperparameter sweep follows the table] |
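The task protocol quoted in the Research Type row can be made concrete with a short sketch. This is a hypothetical reconstruction, not the paper's code: the maze size, the number of tasks, and the `run_episode` placeholder are our assumptions; only the "start and goal change every 20 trials" structure comes from the paper.

```python
import numpy as np

# Hypothetical reconstruction of the task protocol: both start and goal
# locations change every 20 trials, yielding many partially overlapping
# tasks. Sizes and counts below are assumptions made for the sketch.

rng = np.random.default_rng(0)
n_states = 25            # e.g. a 5x5 tabular maze (assumed size)
trials_per_task = 20     # matches "changed every 20 trials" in the paper
n_tasks = 50             # arbitrary number of tasks for this sketch

for _ in range(n_tasks):
    # Draw a fresh start/goal pair, giving a new, partially overlapping task.
    start, goal = rng.choice(n_states, size=2, replace=False)
    for _ in range(trials_per_task):
        # run_episode(start, goal) would go here; it is a placeholder for
        # the agent-environment loop, not part of the paper's code.
        pass
```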
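The Pseudocode row quotes the paper's description of BSR as extending successor TD learning with multiple maps. Below is a minimal single-map sketch of that underlying TD update, assuming a tabular setting; BSR itself (Algorithm 1 in the paper's Supplementary Information) additionally maintains and infers over several such maps, which this sketch does not attempt to reproduce.

```python
import numpy as np

n_states = 25         # hypothetical 5x5 grid maze
gamma = 0.95          # discount factor (assumed value)
alpha_sr = 0.05       # SR learning rate, one of the grid values tested

# Successor map: M[s, s'] estimates expected discounted occupancy of s'
# when starting from s. Identity initialization is a common convention.
M = np.eye(n_states)

def sr_td_update(M, s, s_next):
    """One TD(0) update of the successor map after transition s -> s_next."""
    one_hot = np.zeros(n_states)
    one_hot[s] = 1.0
    td_error = one_hot + gamma * M[s_next] - M[s]
    M[s] += alpha_sr * td_error
    return M

# Values are read out by combining the map with a learned reward vector w:
# V(s) = M[s] @ w, so relearning w alone transfers to new goal locations.
```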
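Finally, the Experiment Setup row reports a grid over exploration rates and SR learning rates. The sketch below mirrors that sweep; the `epsilon_greedy` helper and the commented-out `run_experiment` call are hypothetical names, not the authors' implementation.

```python
import itertools
import numpy as np

# Hypothetical sweep mirroring the reported setup: epsilon-greedy
# exploration rates in {0, 0.05, ..., 0.35} and SR learning rates in
# {0.001, 0.005, 0.01, 0.05, 0.1}.

epsilons = [round(0.05 * i, 2) for i in range(8)]   # 0.0 .. 0.35
sr_learning_rates = [0.001, 0.005, 0.01, 0.05, 0.1]

def epsilon_greedy(q_values, epsilon, rng):
    """Random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

rng = np.random.default_rng(0)
for eps, alpha in itertools.product(epsilons, sr_learning_rates):
    # total_steps = run_experiment(eps, alpha)  # placeholder, not provided
    pass
```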