Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders
Authors: Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin
ICLR 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On real-world datasets, our FairFil effectively reduces the bias degree of pretrained text encoders, while continuously showing desirable performance on downstream tasks. |
| Researcher Affiliation | Academia | Department of Electrical and Computer Engineering, Duke University EMAIL |
| Pseudocode | Yes | Algorithm 1 Updating the FairFil with a sample batch |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement) for source code. |
| Open Datasets | Yes | The training corpora consist 183,060 sentences from the following five datasets: WikiText-2 (Merity et al., 2017), Stanford Sentiment Treebank (Socher et al., 2013), Reddit (Völske et al., 2017), MELD (Poria et al., 2019) and POM (Park et al., 2014). |
| Dataset Splits | Yes | For the downstream tasks of BERT, we follow the setup from Sent-Debias (Liang et al., 2020) and conduct experiments on the following three downstream tasks: (1) SST-2: A sentiment classification task on the Stanford Sentiment Treebank (SST-2) dataset (Socher et al., 2013)... (2) CoLA: Another sentiment classification task on the Corpus of Linguistic Acceptability (CoLA) grammatical acceptability judgment (Warstadt et al., 2019); and (3) QNLI: A binary question answering task on the Question Natural Language Inference (QNLI) dataset (Wang et al., 2018). |
| Hardware Specification | No | The paper mentions training on BERT but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | The batch size is set to 128. The learning rate is $1 \times 10^{-5}$. We train the fair filter for 10 epochs. |
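
For readers attempting a replication, the hyperparameters quoted in the Experiment Setup row can be collected in a small configuration object. This is a minimal sketch, not the authors' code: the class name `FairFilTrainConfig`, its field names, and the helper `total_update_steps` are hypothetical, and the learning rate is read as 1e-5. The step count further assumes that each epoch makes one pass over all 183,060 training sentences and keeps the final partial batch, which the quoted text does not confirm.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FairFilTrainConfig:
    """Hyperparameters quoted in the table; class and field names are hypothetical."""
    batch_size: int = 128
    learning_rate: float = 1e-5  # assumed reading of the quoted learning rate
    epochs: int = 10


def total_update_steps(num_examples: int, cfg: FairFilTrainConfig) -> int:
    """Count batch updates, assuming one full pass over the data per epoch
    and that the final partial batch is kept rather than dropped."""
    steps_per_epoch = math.ceil(num_examples / cfg.batch_size)
    return steps_per_epoch * cfg.epochs


cfg = FairFilTrainConfig()
# 183,060 is the training-corpus size quoted in the Open Datasets row.
steps = total_update_steps(183_060, cfg)
print(f"batch={cfg.batch_size} lr={cfg.learning_rate} epochs={cfg.epochs} steps={steps}")
```

Under these assumptions a run comes to 1,431 updates per epoch. Optimizer choice, warm-up, and hardware are not reported in the table and are left out of the sketch.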