reproducibilityindex.ai

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Authors: Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

ICLR 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	On real-world datasets, our FairFil effectively reduces the bias degree of pretrained text encoders, while continuously showing desirable performance on downstream tasks.
Researcher Affiliation	Academia	Department of Electrical and Computer Engineering, Duke University {pengyu.cheng,weituo.hao,siyang.yuan,shijing.si,lcarin}@duke.edu
Pseudocode	Yes	Algorithm 1 Updating the FairFil with a sample batch
Open Source Code	No	The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for source code.
Open Datasets	Yes	The training corpora consist of 183,060 sentences from the following five datasets: WikiText-2 (Merity et al., 2017), Stanford Sentiment Treebank (Socher et al., 2013), Reddit (Völske et al., 2017), MELD (Poria et al., 2019) and POM (Park et al., 2014).
Dataset Splits	Yes	For the downstream tasks of BERT, we follow the setup from Sent-Debias (Liang et al., 2020) and conduct experiments on the following three downstream tasks: (1) SST-2: A sentiment classification task on the Stanford Sentiment Treebank (SST-2) dataset (Socher et al., 2013)... (2) CoLA: Another sentiment classification task on the Corpus of Linguistic Acceptability (CoLA) grammatical acceptability judgment (Warstadt et al., 2019); and (3) QNLI: A binary question answering task on the Question Natural Language Inference (QNLI) dataset (Wang et al., 2018).
Hardware Specification	No	The paper mentions training on BERT but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup	Yes	The batch size is set to 128. The learning rate is 1 × 10^-5. We train the fair filter for 10 epochs.
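
The reported setup can be collected into a small configuration object, which also makes the implied number of optimizer steps easy to check. This is a minimal sketch, not the authors' code: the class and field names are hypothetical, the learning rate is taken as 1e-5, and it assumes each batch draws 128 sentences from the 183,060-sentence training corpus with the final partial batch kept. The optimizer and any warm-up schedule are not stated in the excerpt, so they are left out.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FairFilTrainConfig:
    """Hypothetical container for the hyperparameters quoted in the table."""

    batch_size: int = 128               # "The batch size is set to 128."
    learning_rate: float = 1e-5         # learning rate, taken as 1 x 10^-5
    num_epochs: int = 10                # "We train the fair filter for 10 epochs."
    num_train_sentences: int = 183_060  # size of the training corpora

    def steps_per_epoch(self) -> int:
        # Assumption: the last, smaller batch is kept rather than dropped.
        return math.ceil(self.num_train_sentences / self.batch_size)

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.num_epochs


config = FairFilTrainConfig()
print(config.steps_per_epoch(), config.total_steps())
```

Under these assumptions one epoch is 1,431 steps (1,430 full batches plus one batch of 20 sentences), so the 10-epoch run is 14,310 steps in total.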