UniBias: Unveiling and Mitigating LLM Bias through Internal Attention and FFN Manipulation
Authors: Hanzhang Zhou, Zijian Feng, Zixiao Zhu, Junlang Qian, Kezhi Mao
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments across 12 NLP datasets demonstrate that UniBias significantly enhances ICL performance and alleviates prompt brittleness of LLMs. |
| Researcher Affiliation | Academia | Hanzhang Zhou (1,2), Zijian Feng (1,2), Zixiao Zhu (1,2), Junlang Qian (1), Kezhi Mao (1,2); (1) Nanyang Technological University, (2) Singapore-ETH Centre; {hanzhang001, feng0119, zixiao001, junlang001}@e.ntu.edu.sg, ekzmao@ntu.edu.sg |
| Pseudocode | No | The paper does not contain a structured pseudocode or algorithm block. |
| Open Source Code | Yes | The code is available at https://github.com/hzzhou01/UniBias. |
| Open Datasets | Yes | We evaluate our UniBias method on 12 diverse natural language processing datasets across various tasks, including sentiment analysis, topic classification, natural language inference, reasoning, and word disambiguation. Statistics and details about the datasets can be found in Table 4 in Appendix. Table 4 lists: SST2 [Socher et al., 2013], MNLI [Williams et al., 2018], WiC [Pilehvar and Camacho-Collados, 2019], COPA [Roemmele et al., 2011], CR [Hu and Liu, 2004], AGNews [Zhang et al., 2015], MR [Pang and Lee, 2005], RTE [Dagan et al., 2005], SST-5 [Socher et al., 2013], TREC [Voorhees and Tice, 2000], ARC-Challenge [Clark et al., 2018], MMLU [Hendrycks et al., 2020]. |
| Dataset Splits | Yes | In our experiments, we utilize k (where k = 0, 1, 2, 4) training samples per class as prompt examples for k-shot ICL. For testing, we randomly select 2000 samples for MMLU and 3000 samples for MNLI and MR, while employing the original testing sets for other datasets. Additionally, we utilize a small subset of training data as a support set, with 20 samples for each class. |
| Hardware Specification | Yes | Additionally, we conduct the experiment on four A5000 GPUs. |
| Software Dependencies | No | The paper mentions models like Llama-2 and GPT-J, but does not provide specific version numbers for software dependencies (e.g., Python, deep learning frameworks, or other libraries). |
| Experiment Setup | Yes | For all experiments, unless stated otherwise, we use the 1-shot ICL setting, i.e., one example per class, and repeat five times under different random seeds. We use k = 20 samples per class as the support set to obtain all threshold values by grid searching, as mentioned in the method section. The prompt template and more implementation details are specified in Appendix A. |
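
Below is a minimal sketch (not the authors' code) of the evaluation setup described in the Dataset Splits and Experiment Setup rows: one demonstration per class for 1-shot ICL, a 20-sample-per-class support set for grid-searching thresholds, and five random seeds. The dataset fields, label names, and prompt template are illustrative assumptions, not taken from the paper.

```python
# Sketch of the k-shot ICL setup: per-class demonstrations, a disjoint
# support set, and repetition over five seeds. Labels and template are
# placeholders standing in for the paper's actual datasets and prompts.
import random
from collections import defaultdict

LABELS = ["negative", "positive"]   # e.g. SST-2-style sentiment labels (assumed)
SEEDS = [0, 1, 2, 3, 4]             # "repeat five times under different random seeds"
K_SHOT = 1                          # one demonstration per class (1-shot ICL)
SUPPORT_PER_CLASS = 20              # support set used for grid-searching thresholds

def split_by_label(examples):
    """Group (text, label) pairs by label."""
    buckets = defaultdict(list)
    for text, label in examples:
        buckets[label].append((text, label))
    return buckets

def sample_icl_setup(train_examples, rng):
    """Sample K_SHOT demonstrations per class and a disjoint support set per class."""
    buckets = split_by_label(train_examples)
    demos, support = [], []
    for label in LABELS:
        pool = buckets[label][:]
        rng.shuffle(pool)
        demos.extend(pool[:K_SHOT])
        support.extend(pool[K_SHOT:K_SHOT + SUPPORT_PER_CLASS])
    rng.shuffle(demos)
    return demos, support

def build_prompt(demos, query_text):
    """Assemble a simple 1-shot-per-class prompt (template is a placeholder)."""
    parts = [f"Review: {text}\nSentiment: {label}\n" for text, label in demos]
    parts.append(f"Review: {query_text}\nSentiment:")
    return "\n".join(parts)

if __name__ == "__main__":
    # Toy training data standing in for a real dataset such as SST-2.
    train = [(f"sample review {i}", LABELS[i % 2]) for i in range(200)]
    for seed in SEEDS:
        rng = random.Random(seed)
        demos, support = sample_icl_setup(train, rng)
        prompt = build_prompt(demos, "a toy query sentence")
        # In the actual pipeline, the prompt is fed to an LLM (e.g. Llama-2 or
        # GPT-J) and the support set is used to grid-search UniBias thresholds.
        print(f"seed={seed}: {len(demos)} demos, {len(support)} support examples")
```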