Deep neuroethology of a virtual rodent
Authors: Josh Merel, Diego Aldarondo, Jesse Marshall, Yuval Tassa, Greg Wayne, Bence Ölveczky
ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then use this platform to study motor activity across contexts by training a model to solve four complex tasks. Using methods familiar to neuroscientists, we describe the behavioral representations and algorithms employed by different layers of the network using a neuroethological approach to characterize motor activity relative to the rodent's behavior and goals. To address these questions, we trained our virtual rodent to solve four complex tasks within a physical environment, all requiring the coordinated control of its body. |
| Researcher Affiliation | Collaboration | 1DeepMind, London, UK. 2Program in Neuroscience, 3Center for Brain Science, 4Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The rodent will be released as part of dm_control/locomotion. |
| Open Datasets | Yes | We implemented a virtual rodent body (Figure 1) in MuJoCo (Todorov et al., 2012), based on measurements of laboratory rats (see Appendix A.1). The rodent will be released as part of dm_control/locomotion. We implemented four tasks adapted from previous work in deep reinforcement learning and motor neuroscience (Merel et al., 2019a; Tassa et al., 2018; Kawai et al., 2015) to encourage diverse motor behaviors in the rodent. A hedged environment-loading sketch follows the table. |
| Dataset Splits | No | The paper describes the training process for the reinforcement learning agent, including the use of parallel workers and replay buffers, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or sample counts) as commonly seen in supervised learning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions using MuJoCo as a physics engine and refers to DeepMind's dm_control/locomotion, but it does not specify version numbers for these or any other software dependencies needed to replicate the experiment. |
| Experiment Setup | Yes | To train a single policy to perform all four tasks, we used an IMPALA-style setup for actor-critic Deep RL (Espeholt et al., 2018)... To update the actor, we used a variant of MPO (Abdolmaleki et al., 2018)... training the multi-task policies using kickstarting for that task (Schmitt et al., 2018), with a weak coefficient (.001 or .005). We inactivated subsets of neurons by clamping activity to the mean values between the first and second taps, e.g., ablation of 64 neurons. Hedged kickstarting and ablation sketches follow the table. |
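
The code-availability rows state that the rodent is released as part of dm_control/locomotion. The sketch below shows how the four tasks might be loaded through dm_control's composer API; the `basic_rodent_2020` module and its constructor names are assumptions based on the public dm_control release, not details stated in the paper.

```python
# Minimal sketch: loading the released rodent tasks from dm_control.
# The module path and constructor names are assumptions based on the public
# dm_control release, not the paper itself.
import numpy as np
from dm_control.locomotion.examples import basic_rodent_2020  # assumed module

# One constructor per task described in the paper (names are assumptions).
make_env = {
    "escape_bowl": basic_rodent_2020.rodent_escape_bowl,
    "run_gaps": basic_rodent_2020.rodent_run_gaps,
    "maze_forage": basic_rodent_2020.rodent_maze_forage,
    "two_touch": basic_rodent_2020.rodent_two_touch,
}

env = make_env["run_gaps"]()      # a dm_env-style composer.Environment
spec = env.action_spec()
timestep = env.reset()
while not timestep.last():
    # Random actions just to exercise the control loop; the paper trains a policy.
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    timestep = env.step(action)
```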
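The setup row quotes kickstarting (Schmitt et al., 2018) with a weak coefficient (.001 or .005). The sketch below illustrates the idea as a small distillation term added to the actor loss; it is written for discrete action distributions for brevity, whereas the rodent's policy is continuous and the analogous term would be a KL between teacher and student Gaussians. All names are illustrative, not the authors' code.

```python
# Illustrative kickstarting term: a weakly weighted distillation loss that
# pulls the multi-task student policy toward a frozen single-task teacher.
import numpy as np

def kickstarting_loss(teacher_probs, student_log_probs, coefficient=0.001):
    """teacher_probs: [B, A] action probabilities from the frozen expert;
    student_log_probs: [B, A] log-probabilities from the multi-task policy;
    coefficient: weak weighting, e.g. 0.001 or 0.005 as quoted in the table."""
    cross_entropy = -(teacher_probs * student_log_probs).sum(axis=-1)
    return coefficient * cross_entropy.mean()

# Schematically, the term is added to the usual actor update:
# total_actor_loss = mpo_actor_loss + kickstarting_loss(teacher, student)
```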
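The same row describes inactivating subsets of neurons by clamping their activity to mean values (e.g., 64 units at a time). A minimal NumPy sketch of that style of ablation follows; the array shapes, reference window, and layer size are placeholders, not the paper's analysis code.

```python
# Hedged sketch of mean-clamping ablation: freeze a chosen subset of units at
# their per-unit mean activity from a reference window, leave the rest intact.
import numpy as np

def ablate_units(activations, unit_idx, reference_window):
    """activations: [T, N] layer activity; unit_idx: indices of units to clamp;
    reference_window: [T_ref, N] activity used to compute per-unit means."""
    clamped = activations.copy()
    unit_means = reference_window.mean(axis=0)   # per-unit mean activity
    clamped[:, unit_idx] = unit_means[unit_idx]  # freeze the ablated units
    return clamped

# Example: ablate a random subset of 64 units, echoing the quoted "ablation of
# 64 neurons" (the layer size of 128 here is illustrative only).
rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 128))
ref = acts[:50]                                  # e.g. a pre-tap reference window
ablated = ablate_units(acts, rng.choice(128, size=64, replace=False), ref)
```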