Inducing brain-relevant bias in natural language processing models

Authors: Dan Schwartz, Mariya Toneva, Leila Wehbe

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that a version of BERT... can improve the prediction of brain activity after fine-tuning. We evaluate the quality of brain predictions made by a particular model by using the brain prediction in a classification task on held-out data... (A hedged sketch of such a classification evaluation appears after this table.)
Researcher Affiliation | Academia | Dan Schwartz (Carnegie Mellon University, drschwar@cs.cmu.edu); Mariya Toneva (Carnegie Mellon University, mariya@cmu.edu); Leila Wehbe (Carnegie Mellon University, lwehbe@cs.cmu.edu)
Pseudocode | No | The paper includes diagrams (e.g., Figure 1) describing the model architecture but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code available at https://github.com/danrsc/bert_brain_neurips_2019
Open Datasets | Yes | In this analysis, we use magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI) data recorded from people as they read a chapter from Harry Potter and the Sorcerer's Stone (Rowling, 1999). The MEG and fMRI experiments were shared respectively by the authors of Wehbe et al. (2014a) at our request and Wehbe et al. (2014b) online (http://www.cs.cmu.edu/~fmri/plosone/).
Dataset Splits | Yes | The fMRI data were recorded in four separate runs in the scanner for each participant... We cross-validate over the fMRI runs. For each fMRI run, we train the model using the examples from the other three runs and use the fourth run to evaluate the model. (A sketch of this split appears after the table.)
Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using 'the PyTorch version of the BERT code provided by Hugging Face' but does not specify version numbers for PyTorch or other software dependencies.
Experiment Setup | Yes | In all of our models, we use a base learning rate of 5 × 10⁻⁵. The learning rate increases linearly from 0 to 5 × 10⁻⁵ during the first 10% of the training epochs and then decreases linearly back to 0 during the remaining epochs. We use mean squared error as our loss function in all models. (A sketch of this schedule appears after the table.)
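
The Research Type row quotes an evaluation in which brain predictions are used in a classification task on held-out data. The excerpt does not say which classification task is used; a common choice in this literature is pairwise matching of predicted to true recordings, and the snippet below is only a minimal sketch under that assumption (the function name and the Euclidean distance metric are illustrative, not taken from the paper).

```python
import numpy as np

def pairwise_classification_accuracy(predicted, actual):
    """Hedged sketch of a pairwise classification evaluation: for every
    pair of held-out examples, the pairing of predictions to true
    recordings counts as correct if the matched assignment has a smaller
    total Euclidean distance than the swapped assignment.

    predicted, actual : arrays of shape (n_examples, n_voxels_or_sensors)
    """
    n_examples = len(actual)
    correct, total = 0, 0
    for i in range(n_examples):
        for j in range(i + 1, n_examples):
            matched = (np.linalg.norm(predicted[i] - actual[i])
                       + np.linalg.norm(predicted[j] - actual[j]))
            swapped = (np.linalg.norm(predicted[i] - actual[j])
                       + np.linalg.norm(predicted[j] - actual[i]))
            correct += matched < swapped
            total += 1
    return correct / total
```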
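
The Dataset Splits row describes leave-one-run-out cross-validation over the four fMRI runs. Below is a minimal sketch of that loop; the array names and the `train_and_evaluate` callable are hypothetical placeholders for the paper's actual training code.

```python
import numpy as np

def cross_validate_over_runs(features, targets, run_ids, train_and_evaluate):
    """For each fMRI run, train on the examples from the other three runs
    and evaluate on the held-out run, as described in the paper.

    features : (n_examples, n_features) stimulus/model representations
    targets  : (n_examples, n_voxels) recorded fMRI responses
    run_ids  : (n_examples,) run label (e.g. 0-3) for each example
    train_and_evaluate : hypothetical callable that fits a model and
        returns an evaluation score on the held-out data
    """
    scores = []
    for held_out_run in np.unique(run_ids):
        test_mask = run_ids == held_out_run
        train_mask = ~test_mask
        scores.append(train_and_evaluate(
            features[train_mask], targets[train_mask],  # three training runs
            features[test_mask], targets[test_mask],    # the held-out run
        ))
    return float(np.mean(scores))  # average over the four folds
```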
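
The Experiment Setup row describes a base learning rate of 5 × 10⁻⁵ with a linear warmup over the first 10% of training followed by a linear decay back to 0, and a mean squared error loss. The sketch below reproduces that shape with PyTorch's `LambdaLR`; the model, optimizer choice (Adam), batch contents, and step count are assumptions for illustration, and the schedule is expressed per optimizer step rather than per epoch for simplicity.

```python
import torch

def warmup_linear_lambda(total_steps, warmup_fraction=0.1):
    """Multiplier on the base learning rate: rises linearly from 0 to 1
    over the first 10% of training, then decays linearly back to 0."""
    warmup_steps = int(total_steps * warmup_fraction)

    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

    return lr_lambda

# Hypothetical stand-in for the fine-tuned BERT plus brain-prediction head.
model = torch.nn.Linear(768, 1000)  # e.g. BERT features -> voxel predictions
total_steps = 1000                  # assumed number of optimizer steps

optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)  # base LR from the paper
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, warmup_linear_lambda(total_steps))
loss_fn = torch.nn.MSELoss()        # mean squared error, as stated in the paper

for step in range(total_steps):
    features = torch.randn(8, 768)        # placeholder input batch
    brain_targets = torch.randn(8, 1000)  # placeholder fMRI/MEG targets
    optimizer.zero_grad()
    loss = loss_fn(model(features), brain_targets)
    loss.backward()
    optimizer.step()
    scheduler.step()                      # advance the warmup/decay schedule
```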