Investigating Gender Bias in Language Models Using Causal Mediation Analysis

Authors: Jesse Vig, Sebastian Gehrmann, Yonatan Belinkov, Sharon Qian, Daniel Nevo, Yaron Singer, Stuart Shieber

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In an experiment using several datasets designed to gauge a model's gender bias, we find that gender bias effects increase with larger models, which potentially absorb more bias from the underlying training data.
Researcher Affiliation | Collaboration | 1. Salesforce Research, 2. Harvard University, 3. Tel Aviv University
Pseudocode | No | The paper provides an illustrative example of a calculation in Figure 3, but does not include generalized pseudocode or an algorithm block.
Open Source Code | Yes | The code for reproducing our results is available at https://github.com/sebastianGehrmann/CausalMediationAnalysis.
Open Datasets | Yes | For neuron intervention experiments, we augment the list of templates from Lu et al. (2018) with several other templates, instantiated with professions from Bolukbasi et al. (2016)... For attention intervention experiments, we use examples from WinoBias Dev/Test (Zhao et al., 2018a) and Winogender (Rudinger et al., 2018)...
Dataset Splits | No | For attention intervention experiments, we use examples from WinoBias Dev/Test (Zhao et al., 2018a) and Winogender (Rudinger et al., 2018), totaling 160/130 and 44 examples that fit our formulation, respectively.
Hardware Specification | No | The paper specifies the models used (GPT-2 variants) but does not describe the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using the GPT-2 variants made available by Wolf et al. (2019), but does not give version numbers for software dependencies such as Python, PyTorch, or the `transformers` library itself.
Experiment Setup | No | The paper describes the models (GPT-2 variants) and datasets used, but does not specify concrete setup details such as learning rates, batch sizes, number of epochs, or optimizer settings.
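
As the Pseudocode and Experiment Setup rows note, the paper analyzes pretrained GPT-2 variants rather than training new models, so the quantities to reproduce are intervention effects on next-token probabilities rather than a training recipe. The sketch below illustrates, under stated assumptions, how a total effect of a set-gender intervention on a templated prompt might be measured with the Hugging Face `transformers` library. The template wording, the profession list, and the `bias_measure` helper are illustrative placeholders, not the paper's released code (which is in the repository linked above), and the effect formula here is only a simplified stand-in for the paper's definitions.

```python
# Minimal sketch, assuming a GPT-2 checkpoint from Hugging Face transformers.
# Measures how replacing an ambiguous profession word with a gendered word
# changes the ratio of "she" vs. "he" continuation probabilities.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "gpt2"  # the paper also studies larger GPT-2 variants
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

def next_token_prob(prompt, continuation):
    """Probability the model assigns to `continuation` as the next token after `prompt`."""
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1]
    probs = torch.softmax(logits, dim=-1)
    cont_id = tokenizer.encode(" " + continuation)[0]  # leading space: GPT-2 BPE convention
    return probs[cont_id].item()

def bias_measure(prompt):
    """Illustrative y(u): ratio of 'she' to 'he' continuation probabilities."""
    return next_token_prob(prompt, "she") / next_token_prob(prompt, "he")

# Placeholder template and professions in the style of Lu et al. (2018) /
# Bolukbasi et al. (2016); not the paper's exact lists.
template = "The {} said that"
for profession in ["nurse", "doctor", "teacher"]:
    base_prompt = template.format(profession)
    # Set-gender intervention: replace the profession with an explicitly gendered word.
    intervened_prompt = template.format("man")
    total_effect = bias_measure(intervened_prompt) / bias_measure(base_prompt) - 1.0
    print(f"{profession:>10s}: total effect = {total_effect:+.3f}")
```

This sketch only covers the input-level (total effect) measurement; the paper's mediation analysis additionally intervenes on individual neurons and attention heads inside the model to separate direct from indirect effects, which the linked repository implements.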