On Measuring and Mitigating Biased Inferences of Word Embeddings
Authors: Sunipa Dev, Tao Li, Jeff M. Phillips, Vivek Srikumar (pp. 7659-7666)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate a reduction in invalid inferences via bias mitigation strategies on static word embeddings (GloVe). Further, we show that for gender bias, these techniques extend to contextualized embeddings when applied selectively only to the static components of contextualized embeddings (ELMo, BERT). Our experiments using this probe reveal that GloVe, ELMo, and BERT embeddings all encode gender, religion and nationality biases. We explore the use of a projection-based method for attenuating biases. Our experiments show that the method works for the static GloVe embeddings. (A hedged sketch of the projection step appears after this table.) |
| Researcher Affiliation | Academia | Sunipa Dev, Tao Li, Jeff M. Phillips, Vivek Srikumar School of Computing University of Utah Salt Lake City, Utah, USA {sunipad, tli, jeffp, svivek}@cs.utah.edu |
| Pseudocode | No | The paper describes the procedures and calculations in prose and mathematical notation, but does not include any clearly labeled pseudocode blocks or algorithms. |
| Open Source Code | Yes | The code for template generation and experimental setup is also online: https://github.com/sunipa/On-Measuring-and-Mitigating-Biased-Inferences-of-Word-Embeddings (an illustrative template-generation sketch appears after this table). |
| Open Datasets | Yes | More recently, research in this task has been revitalized by large labeled corpora such as the Stanford NLI corpus (SNLI; Bowman et al. 2015). We used the GloVe vectors pretrained on the Common Crawl dataset with dimension 300. Our models are trained on the SNLI training set. |
| Dataset Splits | Yes | Our models are trained on the SNLI training set. The extended version of this paper lists further hyperparameters and network details. SNLI accuracies (GloVe), for the original and debiased embeddings: Dev: orig 87.81, -gen 88.14, -nat 87.76, -rel 87.95; Test: orig 86.98, -gen 87.20, -nat 86.87, -rel 87.18. |
| Hardware Specification | No | The paper mentions the use of 'GloVe', 'ELMo', and 'BERT-base' models, but it does not provide any specific details about the hardware used for training or experimentation, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using specific models and tools like 'GloVe', 'ELMo', 'BERT-base', 'decomposable attention model', and 'BiLSTM encoder', but it does not specify version numbers for any of these software components or underlying frameworks/libraries. |
| Experiment Setup | No | The paper states: 'The extended version of this paper lists further hyperparameters and network details.' This indicates that detailed experimental setup information is available, but it is not provided within the main text of the paper. |
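
The projection-based attenuation noted under Research Type removes, from each static embedding, its component along a learned bias direction. Below is a minimal sketch of that idea, assuming NumPy, 300-dimensional GloVe vectors, and a gender direction estimated from seed word pairs; the seed pairs, function names, and the `emb` dictionary are illustrative assumptions, not the paper's exact resources.

```python
import numpy as np

def debias(vectors: np.ndarray, bias_direction: np.ndarray) -> np.ndarray:
    """Project the bias component out of each row of an (n_words, dim) matrix."""
    g = bias_direction / np.linalg.norm(bias_direction)  # unit-norm direction
    return vectors - np.outer(vectors @ g, g)            # v' = v - (v . g) g

def gender_direction(emb: dict) -> np.ndarray:
    """Estimate a gender direction as the first principal component of
    difference vectors over seed pairs (seed list is illustrative)."""
    pairs = [("he", "she"), ("man", "woman"), ("him", "her")]
    diffs = np.stack([emb[a] - emb[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs - diffs.mean(axis=0), full_matrices=False)
    return vt[0]

# Hypothetical usage, with emb a dict of word -> 300-d GloVe vector:
#   matrix = np.stack(list(emb.values()))
#   debiased = debias(matrix, gender_direction(emb))
```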
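The linked repository contains the paper's actual template generation; the sketch below is a hypothetical reconstruction of how such NLI probe pairs can be built. The template string and word lists are illustrative placeholders. The probe logic is as the paper describes: premise and hypothesis differ only in the subject word, so an unbiased model should predict "neutral".

```python
from itertools import product

TEMPLATE = "The {subject} {verb} {object}."

OCCUPATIONS = ["doctor", "nurse", "driver", "teacher"]  # premise subjects (illustrative)
GENDERED = ["man", "woman"]                             # hypothesis subjects (illustrative)
VERBS = ["bought", "owns"]
OBJECTS = ["a car", "a house"]

def generate_pairs():
    """Yield (premise, hypothesis) pairs that differ only in the subject.

    Since the premise says nothing about the subject's gender, a model
    free of gender bias should label every pair 'neutral'.
    """
    for occ, gen, verb, obj in product(OCCUPATIONS, GENDERED, VERBS, OBJECTS):
        premise = TEMPLATE.format(subject=occ, verb=verb, object=obj)
        hypothesis = TEMPLATE.format(subject=gen, verb=verb, object=obj)
        yield premise, hypothesis

for p, h in list(generate_pairs())[:3]:
    print(f"P: {p}\nH: {h}\n")
```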