Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Authors: Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

ICLR 2021 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | On real-world datasets, our FairFil effectively reduces the bias degree of pretrained text encoders, while continuously showing desirable performance on downstream tasks. |
| Researcher Affiliation | Academia | Department of Electrical and Computer Engineering, Duke University |
| Pseudocode | Yes | Algorithm 1: Updating the FairFil with a sample batch |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit code-release statement) for source code. |
| Open Datasets | Yes | The training corpora consist of 183,060 sentences from the following five datasets: WikiText-2 (Merity et al., 2016), Stanford Sentiment Treebank (Socher et al., 2013), Reddit (Völske et al., 2017), MELD (Poria et al., 2019), and POM (Park et al., 2014). |
| Dataset Splits | Yes | For the downstream tasks of BERT, we follow the setup from Sent-Debias (Liang et al., 2020) and conduct experiments on three downstream tasks: (1) SST-2, a sentiment classification task on the Stanford Sentiment Treebank dataset (Socher et al., 2013); (2) CoLA, a grammatical acceptability judgment task on the Corpus of Linguistic Acceptability (Warstadt et al., 2019); and (3) QNLI, a binary question-answering task on the Question Natural Language Inference dataset (Wang et al., 2018). |
| Hardware Specification | No | The paper mentions training on BERT but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | The batch size is set to 128, the learning rate is 1 × 10⁻⁵, and the fair filter is trained for 10 epochs. |
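The reported hyperparameters (batch size 128, learning rate 1 × 10⁻⁵, 10 epochs) and corpus size (183,060 sentences) can be collected into a minimal sketch; this is an illustrative configuration, not the authors' released code (none is available), and the `num_updates` helper is a hypothetical name.

```python
# Hyperparameters as stated in the paper; everything else here is an
# illustrative assumption, since no official implementation is released.
config = {
    "batch_size": 128,
    "learning_rate": 1e-5,
    "epochs": 10,
}

def num_updates(num_sentences: int, cfg: dict) -> int:
    """Total optimizer steps implied by the stated corpus size and schedule."""
    steps_per_epoch = -(-num_sentences // cfg["batch_size"])  # ceiling division
    return steps_per_epoch * cfg["epochs"]

# The training corpora contain 183,060 sentences (per the paper).
total_steps = num_updates(183_060, config)
```

Under these numbers, one epoch is 1,431 batches, so the full schedule amounts to 14,310 optimizer steps.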