Towards Understanding and Mitigating Social Biases in Language Models

Authors: Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency, Ruslan Salakhutdinov

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information for high-fidelity text generation, thereby pushing forward the performance-fairness Pareto frontier.
Researcher Affiliation | Academia | Carnegie Mellon University. Correspondence to: Paul Pu Liang <pliang@cs.cmu.edu>.
Pseudocode | Yes | Algorithm 1: AUTOREGRESSIVE INLP algorithm for mitigating social biases in pretrained LMs. (An illustrative sketch of this decoding-time projection follows the table.)
Open Source Code | Yes | We release our code at https://github.com/pliang279/LM_bias.
Open Datasets | Yes | We collect a large set of 16,338 diverse contexts from 5 real-world text corpora spanning WIKITEXT-2 (Merity et al., 2017), SST (Socher et al., 2013), REDDIT, MELD (Poria et al., 2019), and POM (Park et al., 2014).
Dataset Splits | No | The paper describes the datasets used and refers to 'train', 'validation', and 'test' settings, but does not provide specific percentages or sample counts for dataset splits in the main text.
Hardware Specification | No | The paper acknowledges NVIDIA's GPU support but does not provide specific details about the GPU models, CPU models, or other hardware used to run its experiments.
Software Dependencies | No | The paper mentions software such as GPT-2 and Hugging Face, and resources such as GloVe, but does not provide specific version numbers for any software dependencies.
Experiment Setup | No | The paper describes aspects of the experimental approach, such as training a bias classifier, but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings in the main text.
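The Pseudocode row above names the paper's Autoregressive INLP (A-INLP) procedure: apply a learned nullspace projection to the LM's contextual representation at each decoding step before the output layer. The snippet below is only a minimal sketch of that idea under stated assumptions, not the authors' implementation: the projection matrix P, the mixing weight ALPHA, and the helper debiased_generate are placeholders (the real Algorithm 1 learns the projection from iterative bias classifiers and uses a context-dependent weight).

```python
# Minimal, hedged sketch of decoding-time nullspace projection for GPT-2.
# Assumptions: P is a precomputed debiasing projection (identity here as a
# stand-in) and ALPHA is a fixed mixing weight; both are placeholders, not
# the values or learning procedure from the paper's Algorithm 1.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

HIDDEN = model.config.n_embd      # hidden size (768 for gpt2-small)
P = torch.eye(HIDDEN)             # placeholder projection: identity = no debiasing
ALPHA = 0.7                       # placeholder mixing weight in [0, 1]

@torch.no_grad()
def debiased_generate(prompt, max_new_tokens=20):
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    for _ in range(max_new_tokens):
        # Last-position hidden state from the transformer body.
        hidden = model.transformer(input_ids).last_hidden_state[:, -1, :]
        logits_orig = model.lm_head(hidden)        # original next-token logits
        logits_proj = model.lm_head(hidden @ P.T)  # logits from projected state
        # Interpolate between debiased and original predictions.
        logits = ALPHA * logits_proj + (1 - ALPHA) * logits_orig
        next_id = torch.argmax(logits, dim=-1, keepdim=True)
        input_ids = torch.cat([input_ids, next_id], dim=-1)
    return tokenizer.decode(input_ids[0])

print(debiased_generate("The doctor said that"))
```

With P set to a projection that removes a bias subspace (rather than the identity used here), each generated token is predicted from a representation with that subspace suppressed, which is the intuition behind the autoregressive INLP row above.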