Neural Data Transformer 2: Multi-context Pretraining for Neural Spiking Activity

Authors: Joel Ye, Jennifer Collinger, Leila Wehbe, Robert Gaunt

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We thus develop Neural Data Transformer 2 (NDT2), a spatiotemporal Transformer for neural spiking activity, and demonstrate that pretraining can leverage motor BCI datasets that span sessions, subjects, and experimental tasks. NDT2 enables rapid adaptation to novel contexts in downstream decoding tasks and opens the path to deployment of pretrained DNNs for iBCI control. We focus on offline evaluation on motor applications, demonstrating NDT2's value in decoding unstructured monkey reaching and human iBCI cursor intent. We also show proof-of-principle real-time cursor control using NDT2 in a human with an iBCI. (See the pretraining sketch below the table.)
Researcher Affiliation | Academia | (1) Rehab Neural Engineering Labs, University of Pittsburgh; (2) Neuroscience Institute, Carnegie Mellon University; (3) Center for the Neural Basis of Cognition, Pittsburgh; (4) Department of Physical Medicine and Rehabilitation, University of Pittsburgh; (5) Department of Bioengineering, University of Pittsburgh; (6) Department of Biomedical Engineering, Carnegie Mellon University; (7) Machine Learning Department, Carnegie Mellon University
Pseudocode | No | The paper describes the architecture and methods in text and figures, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code: https://github.com/joel99/context_general_bci
Open Datasets | Yes | In particular, we focus evaluation on a publicly available monkey dataset, where the subjects performed self-paced reaching to random targets generated on a 2D screen (Random Target Task, RTT) [47]. Reference [47]: Joseph E. O'Doherty, Mariana M. B. Cardoso, Joseph G. Makin, and Philip N. Sabes. Nonhuman primate reaching with multichannel sensorimotor cortex electrophysiology, May 2017. URL https://doi.org/10.5281/zenodo.788569.
Dataset Splits | No | The paper notes that models were trained to convergence with early stopping, which implies a validation set, but it does not state the size or percentage of any validation split for its experiments; only a "test split" is specified.
Hardware Specification | Yes | Pilot realtime decoding used an NVIDIA 1060/2060 ... Fitting datasets on the order of 1K trials typically requires 20m-1hr on 12G 1080/2080-series NVIDIA GPUs. 10K-20K trial datasets require 2-8 32G-V100 hours. 100K+ datasets require 72 80G-A100 hours.
Software Dependencies | No | The paper mentions using a "Pytorch implementation of the Transformer layers" but does not specify version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | We pretrain with 50% masking and dropout of 0.1. In pretraining we manually tuned LR to 5e-4 in initial experiments. Our base NDT2 uses dropout 0.1, hidden size 256, weight decay 1e-2. In both pretraining and fine-tuning, we scale batch size (...) to be roughly proportional to full dataset so that each epoch requires 10-100 steps. (See the configuration sketch below the table.)
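
To make the pretraining recipe concrete, below is a minimal, hypothetical sketch of masked modeling over binned spike counts with a small Transformer encoder. It is not the authors' implementation: NDT2 tokenizes spatiotemporally (patching across electrode groups as well as time bins) and conditions on session, subject, and task context, whereas this sketch tokenizes per time bin and omits positional and context embeddings. The Poisson objective, tensor shapes, and module names are assumptions; only the 50% mask ratio, dropout of 0.1, and hidden size of 256 come from the reported setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedSpikeEncoder(nn.Module):
    """Toy masked autoencoder over binned spike counts (illustrative only)."""

    def __init__(self, n_channels=96, d_model=256, n_layers=6, n_heads=4, dropout=0.1):
        super().__init__()
        self.embed = nn.Linear(n_channels, d_model)    # one token per time bin
        self.mask_token = nn.Parameter(torch.zeros(d_model))
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model,
            dropout=dropout, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.readout = nn.Linear(d_model, n_channels)  # log firing rate per channel

    def forward(self, spikes, mask_ratio=0.5):
        # spikes: (batch, time_bins, channels) integer spike counts
        # NOTE: positional and context embeddings are omitted for brevity
        tokens = self.embed(spikes.float())
        masked = torch.rand(spikes.shape[:2], device=spikes.device) < mask_ratio
        tokens = torch.where(masked.unsqueeze(-1),
                             self.mask_token.expand_as(tokens), tokens)
        log_rates = self.readout(self.encoder(tokens))
        # reconstruct only the masked bins; Poisson NLL is an assumed objective
        loss = F.poisson_nll_loss(log_rates[masked], spikes[masked].float(),
                                  log_input=True)
        return loss
```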
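
Similarly, the quoted hyperparameters and batch-size scaling rule can be read as a small training configuration. The AdamW optimizer, the target steps per epoch, and the clamping bounds below are illustrative assumptions; the learning rate of 5e-4 and weight decay of 1e-2 follow the reported values.

```python
import torch


def scaled_batch_size(n_train_examples, target_steps_per_epoch=50,
                      min_bs=64, max_bs=4096):
    """Pick a batch size so one epoch takes on the order of 10-100 steps.

    The target and bounds are assumptions, not values from the paper.
    """
    return max(min_bs, min(max_bs, n_train_examples // target_steps_per_epoch))


model = MaskedSpikeEncoder()  # from the sketch above
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=1e-2)
batch_size = scaled_batch_size(n_train_examples=10_000)  # e.g. a ~10K-trial dataset
```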